ASPERGILLUS NIGER β-GLUCOSIDASE GENE, PROTEIN AND USES THEREOF

FIELD AND BACKGROUND OF THE INVENTION 

The present invention relates to a polypeptide having β-glucosidase
enzymatic activity, to a polynucleotide encoding the polypeptide, to nucleic
acid constructs carrying the polynucleotide, to transformed or infected cells,
such as yeast cells, and organisms expressing the polynucleotide and to various
uses of the polypeptide, the polynucleotide, cells and/or organisms, including,
but not limited to, producing a recombinant polypeptide having β-glucosidase
enzymatic activity, increasing the level of aroma compounds in alcoholic
beverages, as well as other fermentation products of plant material,
hydrolyzing cellobiose and thus increasing the level of fermentable glucose, to
increase production of alcohol, such as ethanol from plant material, increasing
the aroma released from a plant or a plant product, and hydrolysis or
transglycosylation of glycosides.

Abbreviations used herein include: BGL1 - Aspergillus niger B1
β-glucosidase; bgl1 - a cDNA encoding same; 2FGlcF - 2-deoxy-2-fluoro
β-glucosyl fluoride; DNP - 2,4-dinitrophenol; DNPGlc - 2,4-dinitrophenyl
β-D-glucopyranoside; pNP - p-nitrophenol; pNPGlc - p-nitrophenyl
β-D-glucopyranoside; MUGlc - 4-methylumbelliferyl-β-D-glucopyranoside; YNB -
yeast nitrogen base without amino acids; and X-glu -
5-bromo-4-chloro-3-indolyl β-D-glucopyranoside.

β-Glucosidases (EC 3.2.1.21; β-D-glucoside glucohydrolase) play a
number of different important roles in biology, including the degradation of
cellulosic biomass by fungi and bacteria, degradation of glycolipids in
mammalian lysosomes and the cleavage of glucosylated flavonoids in plants.
These enzymes are therefore of considerable industrial interest, not only as
constituents of cellulose-degrading systems, but also in the food industry (2,
3).

Aspergillus species are known as a useful source of β-glucosidases (4-6),
and Aspergillus niger is by far the most efficient producer of β-glucosidase
among the microorganisms investigated (4). Shoseyov et al. (7) have
previously described a β-glucosidase from Aspergillus niger B1 (CMI CC
324626) which is active at low pH, as well as in the presence of high ethanol
concentrations. This enzyme effectively hydrolyzes flavor-compound
glycosides in certain low-pH products, such as wine and passion fruit juice,
thereby enhancing their flavor (8-12), and is particularly attractive for use in
the food industry, as A. niger is considered non-toxic (3). In addition,
β-glucosidase was found useful in enzymatic synthesis of glycosides (13-15).
Other A. niger β-glucosidases have also been purified (16-18); however,
differences in their properties have been reported, including ranges of
molecular weights (116-137 kDa), isoelectric points (pI values of 3.8-4) and
pH optima (3.4-4.5). Indeed, at least two β-glucosidases, with distinct
substrate specificities, have been identified in commercial A. niger
β-glucosidase preparations (19). An attempt to resolve this confusion by
cloning and expressing a functional A. niger β-glucosidase gene in
S. cerevisiae has been previously reported (20); however, the protein was not
characterized, and the sequence was not published.

Glycosidases have been assigned to families on the basis of sequence
similarities, there now being some 77 different such families defined
containing over 2,000 different enzymes (21, see also
http://afmb.cnrs-mrs.fr/~pedro/CAZY/db.html). With the exception of the
glucosylceramidases (Family 30), all simple β-glucosidases belong to either
Family 1 or 3. Family 1 contains enzymes from bacteria, plants and mammals,
including also 6-phospho-glucosidases and thioglucosidases. Furthermore, most
Family 1 enzymes also have significant galactosidase activity. Family 3
contains β-glucosidases and hexosaminidases of fungal, bacterial and plant
origin. Enzymes of both families hydrolyze their substrates with net retention
of anomeric configuration, presumably via a two-step, double-displacement
mechanism, involving two key active site carboxylic acid residues (for reviews
of mechanism, see 22-24). In the first step, one of the carboxylic acids (the
nucleophile) attacks at the substrate anomeric center, while the other (the
acid/base catalyst) protonates the glycosidic oxygen, thereby assisting the
departure of the aglycone. This results in the formation of a covalent
α-glycosyl-enzyme intermediate. In a second step this intermediate is
hydrolyzed by general base-catalyzed attack of water at the anomeric center of
the glycosyl-enzyme, to release the β-glucose product and regenerate free
enzyme. Both the formation and the hydrolysis of this intermediate proceed via
transition states with substantial oxocarbenium ion character.
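
The two-step, double-displacement mechanism described above is commonly
summarized by a minimal kinetic scheme such as the following. The symbols used
here (E, Glc-OR, ROH, K_s, k_2, k_3) are notation introduced for illustration
only and do not appear in the specification; the steady-state expressions are
the standard ones for a two-step covalent mechanism, under the assumption of
rapid-equilibrium substrate binding.

```latex
% Glycosylation (k_2) forms the covalent glycosyl-enzyme; deglycosylation (k_3) hydrolyzes it.
E + \text{Glc-OR} \;\overset{K_s}{\rightleftharpoons}\; E\cdot\text{Glc-OR}
\;\xrightarrow{\;k_2\;}\; E\text{-Glc} + \text{ROH}
\;\xrightarrow{\;k_3,\ \mathrm{H_2O}\;}\; E + \text{Glc-OH}

% Steady-state parameters under the stated assumption
k_{cat} = \frac{k_2\,k_3}{k_2 + k_3}, \qquad
K_m = K_s\,\frac{k_3}{k_2 + k_3}
```

In this notation, a slow deglycosylation step (small k_3) leads to accumulation
of the glycosyl-enzyme intermediate.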

Given that Family 3 contains fungal enzymes of similar mass, including
those from other Aspergillus sp., it is likely that the Aspergillus niger
β-glucosidase would be a member of this family. Mechanistic information on
this family is relatively sparse, the best characterized enzyme being the
glycosylated 170 kDa β-glucosidase from Aspergillus wentii. By labeling the
active site with conduritol B-epoxide, this enzyme was shown to carry out
hydrolysis with net retention of anomeric configuration. This study has
demonstrated that the labeled aspartic acid residue was the same as that
derivatized by the slow substrate D-glucal (1, 25). Furthermore, it was shown
that the 2-deoxyglucosyl-enzyme, trapped by use of D-glucal, was kinetically
identical to that formed during the hydrolysis of
pNP-2-deoxy-β-D-glucopyranoside (26). Further detailed kinetic analysis of the
enzyme was performed by Legler et al. (27), including measurement of Hammett
relationships, kinetic isotope effects and studies of the binding of potent
reversible inhibitors, such as gluconolactone and nojirimycin.

While reducing the present invention to practice, the β-glucosidase
protein was isolated from Aspergillus niger and purified, and its gene was
cloned, sequenced and expressed in yeast host cells, and its enzymatic
function was characterized. In addition, the protein, alone or fused to a
signal peptide and optionally also to an endoplasmic reticulum retaining
peptide, was expressed in transgenic plants, and the release of aroma
substances therefrom following homogenization was monitored. The enzyme
encoded by the isolated gene, as described above, is of known usefulness in
plant and/or plant products, as well as in biotechnological processes,
including the food industry. Several unexpected advantages were uncovered,
including, but not limited to, pH and temperature stability of the
β-glucosidase from Aspergillus niger, a requirement for a signal peptide for
obtaining catalytic activity when expressed in plants, and an advantage for an
endoplasmic reticulum retaining peptide, or for a lack thereof, when expressed
in plants, depending on the application.

SUMMARY OF THE INVENTION 

According to one aspect of the present invention there is provided an
isolated nucleic acid comprising a genomic, complementary or composite
polynucleotide preferably being derived from Aspergillus niger, encoding a
polypeptide having a β-glucosidase catalytic activity and preferably further
encoding, in frame, a signal peptide and an endoplasmic reticulum retaining
peptide.

According to another aspect of the present invention there is provided a
recombinant protein comprising a polypeptide having a β-glucosidase catalytic
activity, the polypeptide is preferably derived from Aspergillus niger and is
preferably fused to a signal peptide and optionally also to an endoplasmic
reticulum retaining peptide.

According to yet another aspect of the present invention there is
provided a nucleic acid construct comprising the isolated nucleic acid
described herein.

According to still another aspect of the present invention there is
provided a host cell or an organism, such as a plant, comprising the nucleic
acid or nucleic acid construct described herein.

According to further features in preferred embodiments of the invention
described below, the polynucleotide is as set forth in SEQ ID NOs:1, 3 or a
portion thereof.

According to still further features in the described preferred
embodiments, the nucleic acid construct further comprises at least one cis
acting control element for regulating expression of the polynucleotide.

According to still further features in the described preferred
embodiments, the host cell is selected from the group consisting of a
prokaryotic cell and a eukaryotic cell.

According to still further features in the described preferred
embodiments the prokaryotic cell is E. coli.

According to still further features in the described preferred
embodiments the eukaryotic cell is selected from the group consisting of a
yeast cell, a fungal cell, a plant cell and an animal cell.

According to still further features in the described preferred
embodiments the polypeptide is as set forth in SEQ ID NO: 2 or a portion
thereof having the β-glucosidase catalytic activity.

According to an additional aspect of the present invention there is
provided a method of producing recombinant β-glucosidase, the method
comprising the step of introducing, in an expressible form, a nucleic acid
construct into a host cell, the nucleic acid construct including a genomic,
complementary or composite polynucleotide preferably derived from
Aspergillus niger, encoding a polypeptide having a β-glucosidase catalytic
activity and preferably further encoding, in frame, a signal peptide and an
endoplasmic reticulum retaining peptide.

According to further features in preferred embodiments of the invention
described below, the method further comprises the step of extracting the
polypeptide having the β-glucosidase catalytic activity.

According to yet an additional aspect of the present invention there is
provided a method of producing a recombinant β-glucosidase overexpressing
cell, the method comprising the step of introducing, in an overexpressible
form, a nucleic acid construct into a host cell, the nucleic acid construct
including a genomic, complementary or composite polynucleotide preferably
derived from Aspergillus niger, encoding a polypeptide having a β-glucosidase
catalytic activity and preferably further encoding, in frame, a signal peptide
and an endoplasmic reticulum retaining peptide.

According to still an additional aspect of the present invention there is
provided a method of increasing a level of at least one fermentation substance
in a fermentation product, the method comprising the step of fermenting a
glucose containing fermentation starting material by a yeast cell
overexpressing a nucleic acid construct including a genomic, complementary
or composite polynucleotide being preferably derived from Aspergillus niger,
encoding a polypeptide having a β-glucosidase catalytic activity and preferably
further encoding, in frame, a signal peptide and an endoplasmic reticulum
retaining peptide, thereby increasing the level of the at least one fermentation
substance in the fermentation product.

According to a further aspect of the present invention there is provided
a method of increasing a level of at least one fermentation substance in a
fermentation product, the method comprising the step of fermenting a plant
derived glucose containing fermentation starting material by a yeast cell, the
plant overexpressing a nucleic acid construct including a genomic,
complementary or composite polynucleotide preferably derived from
Aspergillus niger, encoding a polypeptide having a β-glucosidase catalytic
activity and preferably further encoding, in frame, a signal peptide and an
endoplasmic reticulum retaining peptide, thereby increasing the level of the at
least one fermentation substance in the fermentation product.

According to a further aspect of the present invention there is provided
a method of increasing a level of at least one aroma substance in a plant
derived product, the method comprising the step of incubating a glucose
containing plant starting material with a yeast cell overexpressing a nucleic
acid construct including a genomic, complementary or composite
polynucleotide preferably derived from Aspergillus niger, encoding a
polypeptide having a β-glucosidase catalytic activity and preferably further
encoding, in frame, a signal peptide and an endoplasmic reticulum retaining
peptide, thereby increasing the level of the at least one aroma substance in the
plant derived product.

According to yet a further aspect of the present invention there is
provided a method of increasing a level of at least one aroma substance in a
plant derived product, the method comprising the step of incubating a glucose
containing plant starting material with a yeast cell, said plant overexpressing
a nucleic acid construct including a genomic, complementary or composite
polynucleotide preferably derived from Aspergillus niger, encoding a
polypeptide having a β-glucosidase catalytic activity and preferably further
encoding, in frame, a signal peptide and an endoplasmic reticulum retaining
peptide, thereby increasing the level of the at least one aroma substance in the
plant derived product.

According to still further features in the described preferred
embodiments the plant derived product is a fermentation product, such as, but
not limited to, an alcoholic beverage.

According to still a further aspect of the present invention there is
provided a method of increasing a level of free glucose in a glucose containing
fermentation starting material, the method comprising the step of fermenting
the glucose containing fermentation starting material by a cell overexpressing
a nucleic acid construct including a genomic, complementary or composite
polynucleotide preferably derived from Aspergillus niger, encoding a
polypeptide having a β-glucosidase catalytic activity and preferably further
encoding, in frame, a signal peptide and an endoplasmic reticulum retaining
peptide, thereby increasing the level of the free glucose in the glucose
containing fermentation starting material.

According to another aspect of the present invention there is provided a
method of increasing a level of free glucose in a plant derived glucose
containing fermentation starting material, the method comprising the step of
fermenting the plant derived glucose containing fermentation starting material
by a cell, the plant overexpressing a nucleic acid construct including a
genomic, complementary or composite polynucleotide preferably derived from
Aspergillus niger, encoding a polypeptide having a β-glucosidase catalytic
activity and preferably further encoding, in frame, a signal peptide and an
endoplasmic reticulum retaining peptide, thereby increasing the level of the
free glucose in the plant.

According to yet another aspect of the present invention there is
provided a method of increasing a level of free glucose in a plant, the method
comprising the step of overexpressing in the plant a nucleic acid construct
including a genomic, complementary or composite polynucleotide preferably
derived from Aspergillus niger, encoding a polypeptide having a β-glucosidase
catalytic activity and preferably further encoding, in frame, a signal peptide
and an endoplasmic reticulum retaining peptide, thereby increasing the level of
the free glucose in the plant.

According to still another aspect of the present invention there is
provided a method of producing an alcohol, the method comprising the step of
fermenting a glucose containing fermentation starting material by a cell
overexpressing a nucleic acid construct including a genomic, complementary
or composite polynucleotide preferably derived from Aspergillus niger,
encoding a polypeptide having a β-glucosidase catalytic activity and preferably
further encoding, in frame, a signal peptide and an endoplasmic reticulum
retaining peptide, and extracting the alcohol therefrom.

According to an additional aspect of the present invention there is
provided a method of producing an alcohol, the method comprising the step of
fermenting a plant derived glucose containing fermentation starting material by
a cell, the plant overexpressing a nucleic acid construct including a genomic,
complementary or composite polynucleotide preferably derived from
Aspergillus niger, encoding a polypeptide having a β-glucosidase catalytic
activity and preferably further encoding, in frame, a signal peptide and an
endoplasmic reticulum retaining peptide, and extracting the alcohol therefrom.

According to an additional aspect of the present invention there is
provided a method of producing an aroma spreading plant, the method
comprising the step of overexpressing in the plant a nucleic acid construct
including a genomic, complementary or composite polynucleotide preferably
derived from Aspergillus niger, encoding a polypeptide having a β-glucosidase
catalytic activity and preferably further encoding, in frame, a signal peptide
and an endoplasmic reticulum retaining peptide, thereby increasing aroma
spread from the plant.

According to further features in preferred embodiments of the invention
described below, overexpressing the nucleic acid construct is performed in a
tissue specific manner.

According to still further features in the described preferred
embodiments overexpressing the nucleic acid construct is limited to at least
one tissue selected from the group consisting of flower, fruit, seed, root, stem,
pollen and leaves.

The present invention successfully addresses the shortcomings of the
presently known configurations by providing a polypeptide having
β-glucosidase enzymatic activity, a polynucleotide encoding the polypeptide,
nucleic acid constructs carrying the polynucleotide, transformed or infected
cells, such as yeast cells, and organisms expressing the polynucleotide and
various uses of the polypeptide, the polynucleotide, cells and/or organisms,
including, but not limited to, producing a recombinant polypeptide having
β-glucosidase enzymatic activity, increasing the level of aroma compounds in
alcoholic beverages, as well as other fermentation products of plant material,
hydrolyzing cellobiose and thus increasing the level of fermentable and/or free
glucose, to increase production of a fermentation product, such as ethanol from
plant material, increasing the aroma released from a plant or a plant product,
and hydrolysis or transglycosylation of glycosides.

BRIEF DESCRIPTION OF THE DRAWINGS 

The invention is herein described, by way of example only, with 
reference to the accompanying drawings. With specific reference now to the 
drawings in detail, it is stressed that the particulars shown are by way of 
example and for purposes of illustrative discussion of the preferred 
embodiments of the present invention only, and are presented in the cause of 
providing what is believed to be the most useful and readily understood 
description of the principles and conceptual aspects of the invention. In this 
regard, no attempt is made to show structural details of the invention in more 
detail than is necessary for a fundamental understanding of the invention, the 
description taken with the drawings making apparent to those skilled in the art 
how the several forms of the invention may be embodied in practice. 

In the drawings: 

FIGs. 1a-c demonstrate plasmid maps employed as expression vectors
for bgl1 cDNA. Figure 1a - E. coli expression vector containing bgl1 cDNA,
inserted into the NcoI/BamHI sites of pET3d. Figure 1b - S. cerevisiae
expression vector containing bgl1 cDNA, inserted into the HindIII/BamHI
sites of pYES2-bgl1 plasmid. Figure 1c - P. pastoris expression vector
containing bgl1 cDNA, inserted into the EcoRI/BamHI sites of pHIL-S1.

FIGs. 2a-b demonstrate SDS-PAGE analysis of active protein samples
eluted from a Mono-Q column, stained with Coomassie blue (Figure 2a), or
β-glucosidase zymogram (Figure 2b) using MUGlc as a substrate. Lanes (for
both Figures 2a and 2b): 1 - Electroeluted band of BGL1 from preparative
PAGE-SDS gel slabs; 2, 3, 4, 5 - acetone precipitates from Mono-Q separation
of BGL1.

FIG. 3 demonstrates SDS-PAGE analysis of β-glucosidase purified by
Mono-Q and Resource-S. Lanes: 1 - crude (27.5 µg protein); 2 - active
fraction after Mono-Q (7 µg protein); and 3 - active fraction after Resource-S
(10 µg protein).

FIG. 4 demonstrates SDS-PAGE analysis of β-glucosidase
deglycosylated by N-glycosidase-F. Lanes: 1 - molecular weight marker; 2 -
native β-glucosidase; and 3 - deglycosylated protein.

FIG. 5a demonstrates the DNA and amino acid sequences of bgl1.
Amino acid sequences determined by Edman degradation are underlined.
DNA sequences of introns are underlined. Signal peptide is indicated by italic
letters.

FIG. 5b demonstrates bgl1 gene organization. Exons (E1-7) are
indicated by filled boxes, introns by solid lines, restriction sites and the stop
codon by arrows.

FIG. 6a demonstrates a Western blot analysis of recombinant BGL1
expressed in S. cerevisiae. Lanes: 1 - native BGL1 (positive control); 2 - total
protein extract of S. cerevisiae expressing recombinant BGL1; 3 - total protein
extract of S. cerevisiae without the bgl1 expression vector (negative control).

FIG. 6b demonstrates a Western blot analysis of recombinant BGL1
secreted from P. pastoris. Lanes: 1 - molecular weight marker; 2 - medium
supernatant of P. pastoris expressing recombinant BGL1; 3 - medium
supernatant of P. pastoris host without the vector (negative control).

FIG. 7 demonstrates proton-NMR spectra, illustrating the
stereochemical course of pNPGlc hydrolysis by A. niger β-glucosidase.
Spectra are for the anomeric proton region of the substrate at different time
intervals relative to addition of the enzyme.

FIG. 8 demonstrates inactivation of recombinant BGL1 by 2FGlcF.
Pure enzyme was incubated in the presence of various concentrations of the
inactivator, and residual enzyme activity was determined at different time
intervals. Residual activity is presented, semilogarithmically, versus time, in
the presence of the indicated concentrations of inactivator.

FIG. 9 demonstrates reactivation of 2-deoxy-2-fluoroglucosyl-
recombinant BGL1 by linamarin. Activity is plotted versus incubation time in
the presence of the indicated concentrations of linamarin.
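
A semilogarithmic presentation of residual activity versus time, as described
for FIG. 8, is linear when inactivation follows pseudo-first-order kinetics,
and the slope then gives an observed rate constant for each inactivator
concentration. The Python sketch below illustrates such a fit. The
pseudo-first-order assumption, the function name fit_k_obs and all numerical
values are hypothetical illustrations and are not data or methods taken from
the specification.

```python
import math


def fit_k_obs(times_min, residual_activity):
    """Return k_obs (1/min) as minus the least-squares slope of ln(activity) vs time."""
    ys = [math.log(a) for a in residual_activity]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, ys))
    return -sxy / sxx


# Hypothetical, synthetic time course generated with k_obs = 0.05 per minute
times = [0, 5, 10, 20, 40]
activity = [100 * math.exp(-0.05 * t) for t in times]
k_obs = fit_k_obs(times, activity)
```

Repeating the fit at each inactivator concentration would give the set of
observed rate constants that such a plot conveys graphically.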

FIG. 10 demonstrates the stability of recombinant A. niger
β-glucosidase at various temperatures. Activity is calculated as percent of a
recombinant enzyme solution kept at 4 °C.
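
The normalization described for FIG. 10, in which activity is expressed as a
percent of a control kept at 4 °C, can be sketched as follows. The function
name percent_of_control, the temperatures and the activity readings are
hypothetical and serve only to illustrate the arithmetic.

```python
def percent_of_control(activity, control_activity):
    """Express an activity reading as a percentage of the 4 degC control reading."""
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * activity / control_activity


# Hypothetical readings (arbitrary units), keyed by incubation temperature in degC
readings = {40: 9.5, 50: 9.0, 60: 4.5}
control = 10.0  # hypothetical reading for the enzyme solution kept at 4 degC
relative = {temp: percent_of_control(a, control) for temp, a in readings.items()}
```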

FIGs. 11a-c show schematic depictions of expression cassettes used for
expression of A. niger β-glucosidase in tobacco plants. Figure 11a - a cassette
encoding BGL1 without a signal peptide (see, SEQ ID NO: 13 for the
nucleotide sequence and SEQ ID NO: 14 for the amino acid sequence); Figure
11b - a cassette encoding a BGL1 fused to a Cel1 signal peptide for secretion
into the apoplast (see, SEQ ID NO: 15 for the nucleotide sequence and SEQ ID
NO: 16 for the amino acid sequence); and Figure 11c - a cassette encoding a
BGL1 fused to Cel1 signal peptide as in Figure 11b and in addition to HDEL
(SEQ ID NO: 17) ER-retaining peptide at the C-terminus for accumulation in
the ER (see, SEQ ID NO: 18 for the nucleotide sequence and SEQ ID NO: 19
for the amino acid sequence).

FIG. 12 demonstrates PCR amplification results of bgl1 cDNA
indicating the presence of bgl1 cDNA in transgenic plants. CB10 and CB11 -
transgenic plants transformed with bgl1 and Cel1 signal peptide without
HDEL ER retaining peptide. CBT3, CBT8 and CBT15 - different transgenic
lines transformed with bgl1, Cel1 signal peptide and HDEL. B1 - a transgenic
plant transformed with bgl1. 1kb - 1 kb DNA marker. WT - wild type
non-transgenic plant. pETB1 - bgl1 plasmid DNA.

FIGs. 13a-b show Western blot analyses of transgenic plants containing
BGL1 without signal peptide (13a), and BGL1 with Cel1 signal peptide (13b),
with and without HDEL ER retaining peptide. An gluco - purified A. niger
beta-glucosidase. WT - nontransgenic control plant. B1, B15, B16, B20, B27,
B33 and B34 - different transgenic lines transformed with bgl1. CBT1, CBT3,
CBT7 and CBT8 - different transgenic lines transformed with bgl1, Cel1
signal peptide and HDEL. CB10 and CB12 - transgenic plants transformed
with bgl1 and Cel1 signal peptide without HDEL ER retaining peptide.

FIG. 14 shows activity gel analysis of transgenic tobacco plant extracts
in SDS-PAGE incubated with MUGlc. WT - non-transgenic control plant.
CB10 and CB11 - two independent lines of transgenic plants expressing BGL1
fused to Cel1 signal peptide (without HDEL). CBT3, CBT8 and CBT15 -
independent lines of transgenic plants expressing BGL1 fused to Cel1 signal
peptide at the N terminus and HDEL ER retaining peptide at the C terminus.
B1 and B34 - transgenic plants expressing BGL1 without signal peptide or
HDEL ER retaining peptide and which were positive for BGL1 protein in
Western blot analysis. An Glu - control A. niger native beta-glucosidase.

FIG. 15 demonstrates the level of BGL1 activity in different transgenic
plants. WT - non-transgenic control plant. B1 and B21 - transgenic plants
expressing BGL1 without signal peptide or HDEL ER retaining peptide and
which were positive for BGL1 in Western blot analysis. CBT8, CBT21, CBTO
and CBT15 - independent lines of transgenic plants expressing BGL1 fused to
Cel1 signal peptide at the N terminus and HDEL ER retaining peptide at the C
terminus. CB12, CB13, CB14 and CB15 - four independent lines of
transgenic plants expressing BGL1 fused to Cel1 signal peptide (without
HDEL).

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention is of a polypeptide having β-glucosidase
enzymatic activity, a polynucleotide encoding the polypeptide, nucleic acid
constructs carrying the polynucleotide, transformed or infected cells, such as
yeast cells, and organisms expressing the polynucleotide and various uses of
the polypeptide, the polynucleotide, cells and/or organisms, including, but not
limited to, producing a recombinant polypeptide having the β-glucosidase
enzymatic activity, increasing the level of aroma compounds in alcoholic
beverages, as well as other fermentation products of plant material,
hydrolyzing cellobiose and thus increasing the level of fermentable glucose,
increasing the production of alcohol, such as ethanol from plant material,
increasing the aroma released from a plant or a plant product, and hydrolysis or
transglycosylation of glycosides.

The principles and operation of the present invention may be better 
understood with reference to the drawings and accompanying descriptions. 

Before explaining at least one embodiment of the invention in detail, it
is to be understood that the invention is not limited in its application to the
details of the components set forth in the following description or exemplified
in the examples that follow. The invention is capable of other embodiments or
of being practiced or carried out in various ways. Also, it is to be understood
that the phraseology and terminology employed herein is for the purpose of
description and should not be regarded as limiting.

According to one aspect of the present invention there is provided an
isolated nucleic acid comprising a genomic, complementary or composite
polynucleotide encoding a polypeptide having a β-glucosidase catalytic
activity. Preferably the polynucleotide is derived from Aspergillus niger,
however other sources are applicable. These include all isolated
polynucleotides encoding polypeptides having β-glucosidase catalytic activity.
Such polynucleotides and polypeptides identified by their GenBank Accession
Nos. are listed in Table 1 below, all of which can be used while implementing
the present invention.

25 TABLE 1 

Accession numbers of cDNA and their 
encoded beta-glucosidases (EC.3.2.1.21) 



Organism 


SWISS-PROT 


EMBL 


Acetobacter xylinus 


024749 


AB003689; AB010645 


Agrobacterium sp. 


P12614 


M19033; AAA22085.1 


Agrobacterium tumefaciens 


P27034 


M59852; AAA22082.1 
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Arabidopsis t ha liana 


082772, 024433, 023656 


AF082157; AF082158; 
AC009327; U72153; U72155 
AC020665; AC066691 


Aspergillus aculeatus 


P48825 


D64088, BAA 10968.1 


Aspergillus kawachi 


P87076 


AB003470 


Aspergillus niger B 1 




AJ 132386; CAB75696.1 


Aspergillus niger A MSI 


Q9P456 


AF268911 


A vena sativa 


Q38786, Q9ZP27 


X78433; AF082991 


Azospirillum irakense 




AF090429; AAF21798.1 


Bacillus circulans 


Q03506 


M96979; AAA22266.1 


Bacillus sp. GLl 


Q9ZNN7 


AB009411;BAA36161.1; 
AB009410 


Bacillus polymyxa 


P22073, P22505 


M60210; M60211 


Bacillus subtil is 


P40740 


Z34526; CAA84287.1 


Bacillus subtilis 


P42403 


D30762; BAA06429.1 


Bacteroides fragilis 


031356 


AF006658; AAB62870. 1 


Bijidooacterium breve 


P94248, 008487 


D84489; D8831 1 


Botryoiima fuckeliana 




AJ130890;CAB61489.1 


Brassica napus 


Q42618 


X82577 


Brassica nigra 


024434 


U72154 


Butyrivibno fibnsolvens 


PI 6084 


M31120; AAA23008.1 


Caldocellum saccharolyticum 


P10482 


X12575; CAA31087.1 


Caldicellulosiruptor sp. 14B 


Q9ZEN0 


AJ131346 


Candida wickerhamii 


Q12601 


U 13672 


Cavia porcellus 


P97265 


U50545 


Celluiomonas biazotea 


051843 


AF005277; AAC38196.1 


Cellulomonas fimi 


Q46043 


M94865 


Cellvibrio gilvus 


P96316 


D14068;BAA03 152.1 


Chryseobacterium 
m eningosepticum 


030713 


AF015915 


Clostridium stercorarium 


008331 


Z94045 


Clostridium thermocellum 


P26208 


X60268;CAA42814.1 


Clostridium thermocellum 


PI 4002 


XI 5644; CAA33665.1 


Coccidioides immitis 


O 14424 


U87805; AF022893 


Costus speciosus 


Q42707 


D83177 


Dalbergia cochinchinensis 


Q9SPK3 


AF163097 


Dictyostelium discoideum 


Q23892 


L21014 


Digitalis lanata 


Q9ZPB6 


AJ133406 


Erwinia chrysanthemi 


Q46684 


U08606; AAA80156.1 


Lrwinia nerbicola 


Q59437 


X79911;CAA56282.1 


Escherichia colt 


P33363 


U15049; AAB38487.1 


Escherichia coll K 12/ MG 1 655 


E65074, Q46829 


U28375; AE000373 


Glycine max 




AF000378; AAD09291.1 


JJ 1 f 

Hansenula anomala 


P06835 


X02903;CAA26662.1 


Homo sapiens 




AJ278964; CAC08 178.1 


Hordeum vulgare 


Q40025 


L41869 


Humicola grisea var.thermoidea 


093784 


AB003109 


Kluyveromyces marxianus 


P07337 


X05918;CAA29353.1 


Lactobacillus pi ant arum 


086291 


Y15954; AJ250202; CAB71 149.1 


Manihot esculent a 


Q40283 


X94986 


Microbispora bispora 


P38645 


M97265; AAA25311.1 


Nicotiana tabacum




AB017502; BAA33065.1


Orpinomyces sp. PC-2 




AF016864; AAD45834.1 


Oryza sativa 


Q42975 


U28047 
Paenibacillus polymyxa


P22073


M60210; AAA22263.1 


Paenibacillus polymyxa 


P22505 


M60211; AAA22264.1 


Phaeosphaeria avenaria 




AJ276675; CAB82861.1 


Phanerochaete chrysosporium 


O74203


AF036872; AF036873 


Pichia anomala (Candida 
pelliculosa) 


P06835 


X02903 


Pinus contorta




AF072736; AAC696.1 


Polygonum tinctorium 




AB003089;BAA78708.1 


Prunus avium


Q43014 


U39228 


Prunus serotina 


Q43073, Q40984 


U50201; U26025 


Prevotella albensis M384 




AJ276021; CAC07184.1


Prevotella ruminicola 


Q59716 


U35425 


Pyrococcus furiosus 


Q51723 


AF013169; U37557 


Ruminococcus albus 


P15885, O66050


X15415; CAA33461.1, U92808


Saccharomycopsis fibuligera 


P22506 


M22475; AAA34314.1 


Saccharomycopsis fibuligera 


P22507 


M22476; AAA34315.1 


Saccharopolyspora erythraea 


O70021


Y14327 


Salmonella typhimurium 


Q56078 


D86507; BAA13102.1


Schizophyllum commune 


P29091 


M27313; AAA33925.1 


Schizosaccharomyces pombe




AL355920; CAB91163.1


Secale cereale 




AF293849; AAG00614.1 


Septoria lycopersici 


Q99324 


U24701; U35462 


Sorghum bicolor 


Q41290 


U33817 


Spodoptera frugiperda 


O61594


AF052729 


Streptomyces coelicolor A3 (2) 




AL121596; CAB56653.1 


Streptomyces reticuli 


Q9X9R4 


AJ009797 


Streptomyces rochei A2 


Q55000 


X74291 


Streptomyces sp. QM-B814 


Q59976 


Z29625 


Thermoanaerobacter brockii 


P96090, Q60026 


Z56279; Z56279 


Thermobifida fusca ER1 




AF086819; AAF37727.1 


Thermococcus sp. 


O08324


Z70242 


Thermotoga maritima


Q08638 


X74163;CAA52276.1 


Thermotoga neapolitana 


O33843, Q60038


Z97212; Z77856; CAB10165.1 


Thermus sp. Z-1


Q9RA58 


AB034947 


Thermus thermophilus 


Q9X9D4 


Y16753


Trichoderma reesei (Hypocrea jecorina)


Q12715, O93785


U09580; AAA18473.1, AB003110


Trifolium repens 


P26204 


X56734;CAA40058.1 


Trifolium repens 


P26205 


X56733;CAA40057.1 


Tropaeolum majus 


O82074


AJ006501; CAA07070.1 


Zea mays 


P49235, Q41761 


X74217, U25157; CAA52293.1 
U33816, U44087, U44773 


Unidentified bacterium 


Q60055 


U12011 



As used herein in the specification and in the claims section that 
follows, the term "isolated" refers to a biological component (such as a nucleic 
acid or protein or organelle) that has been substantially separated or purified 
away from other biological components in the cell of the organism in which
the component naturally occurs, i.e., other chromosomal and
extra-chromosomal DNA and RNA, proteins and organelles. Nucleic acids and
proteins that have been "isolated" include nucleic acids and proteins purified
by standard purification methods. The term also embraces nucleic acids and
proteins prepared by recombinant expression in a host cell as well as
chemically synthesized nucleic acids.

As used herein and in the claims section that follows the terms and 
phrases "polynucleotide" and "polynucleotide sequence" are used 
interchangeably and refer to a nucleotide sequence which can be DNA or RNA 
of, for example, genomic or synthetic origin, which may be single- or
double-stranded, and which may represent the sense or antisense strand.
Similarly, the terms "polypeptide" and "polypeptide sequence" are used
interchangeably herein and refer to an amino acid sequence of any length.

As used herein in the specification and in the claims section that 
follows, the phrase "complementary polynucleotide sequence" includes
sequences which originally result from reverse transcription of messenger
RNA using a reverse transcriptase or any other RNA-dependent DNA
polymerase. Such sequences can be subsequently amplified in vivo or in vitro
using a DNA-dependent DNA polymerase.

As used herein in the specification and in the claims section that
follows, the phrase "genomic polynucleotide sequence" includes sequences
which originally derive from a chromosome and reflect a contiguous portion of
a chromosome.

As used herein in the specification and in the claims section that 
follows, the phrase "composite polynucleotide sequence" includes sequences
which are at least partially complementary and at least partially genomic. A
composite sequence can include some exonic sequences required to encode the
polypeptide having the β-glucosidase catalytic activity, as well as some
intronic sequences interposed therebetween. The intronic sequences can be of
any source, including other genes, and typically will include conserved
splicing signal sequences. Such intronic sequences may further include
cis-acting expression regulatory elements, as hereinbelow described.

As used herein in the specification and in the claims section that
follows, the phrase "having a β-glucosidase catalytic activity" refers to a
polypeptide sequence, protein or fragments thereof capable of serving as
catalysts of a chemical reaction involving hydrolysis of the O-glycosidic bond
of glucosides, the result of which is the release of β-D-glucose residue(s), or
of an aglycon in addition to the β-D-glucose residue. Specifically, hydrolysis
by retaining enzymes is performed while maintaining the β-configuration of the
anomeric center of the carbohydrate. β-Glucosidases have a wide substrate
specificity; thus, some also hydrolyze one or more of the following:
β-D-galactosides, α-L-arabinosides, β-D-xylosides, and β-D-fucosides.

As used herein the term "catalyst" refers to a substance that accelerates 
a chemical reaction, but is not consumed or changed permanently thereby. 

As used herein the term "glucoside" refers to a compound of at least
two monomers, at least one of which is a glucose, linked by a glycosidic bond.
Examples of glucosides include, but are not limited to, glucose-containing
backbones, such as the diglucose cellobiose, and the glucose polymer,
cellulose.

According to preferred embodiments, the polynucleotide according to
this aspect of the present invention encodes a polypeptide as set forth in SEQ
ID NO:2 or a portion thereof which retains β-glucosidase catalytic activity.

Alternatively or additionally, the polynucleotide according to this aspect
of the present invention is as set forth in SEQ ID NO:1 or 3, or a portion
thereof, which portion encodes a polypeptide retaining β-glucosidase catalytic
activity.

In a broader aspect the polynucleotides according to the present
invention encode a polypeptide which is at least 75 %, at least 80 %, at least
85 %, at least 90 %, at least 95 % or more, say 95 % - 100 %, homologous to SEQ
ID NO:2 as determined using the BestFit software of the Wisconsin sequence
analysis package, utilizing the Smith and Waterman algorithm, where gap
creation penalty equals 8 and gap extension penalty equals 2.

According to preferred embodiments, the polynucleotides according to
the broader aspect of the present invention encode a polypeptide as set forth
in SEQ ID NOs:1 or 3 or a portion thereof which retains activity.

Alternatively or additionally, the polynucleotides according to this 
broader aspect of the present invention are hybridizable with SEQ ID NOs: 1 
or 3. 

Hybridization for long nucleic acids (e.g., above 200 bp in length) is
effected according to preferred embodiments of the present invention by
stringent or moderate hybridization, wherein stringent hybridization is effected
by a hybridization solution containing 10 % dextran sulfate, 1 M NaCl, 1 %
SDS and 5 x 10^6 cpm 32P-labeled probe, at 65 °C, with a final wash solution
of 0.2 x SSC and 0.1 % SDS and final wash at 65 °C; whereas moderate
hybridization is effected by a hybridization solution containing 10 % dextran
sulfate, 1 M NaCl, 1 % SDS and 5 x 10^6 cpm 32P-labeled probe, at 65 °C,
with a final wash solution of 1 x SSC and 0.1 % SDS and final wash at 50 °C.
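
The stringent and moderate condition sets share the same hybridization
solution and temperature and differ only in the final wash. As a minimal
sketch, the two sets can be recorded side by side and compared; the class and
constant names are illustrative only and are not part of the specification,
and the stringent probe count is read as 5 x 10^6 cpm by analogy with the
moderate conditions.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HybridizationConditions:
    """Illustrative record of one set of hybridization conditions."""
    dextran_sulfate_pct: float   # % dextran sulfate in hybridization solution
    nacl_molar: float            # M NaCl
    sds_pct: float               # % SDS in hybridization solution
    probe_cpm: float             # cpm of 32P-labeled probe (assumed 5e6 for both)
    hybridization_temp_c: float  # hybridization temperature, deg C
    wash_ssc_x: float            # SSC concentration of final wash (x-fold)
    wash_sds_pct: float          # % SDS in final wash
    wash_temp_c: float           # final wash temperature, deg C


STRINGENT = HybridizationConditions(10, 1, 1, 5e6, 65, 0.2, 0.1, 65)
MODERATE = HybridizationConditions(10, 1, 1, 5e6, 65, 1.0, 0.1, 50)


def differing_fields(a: HybridizationConditions,
                     b: HybridizationConditions) -> list[str]:
    """Return the names of the parameters on which two condition sets differ."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


if __name__ == "__main__":
    print(differing_fields(STRINGENT, MODERATE))
```

Comparing the two records lists only the wash SSC concentration and the wash
temperature, which is where the stringency is set in the text above.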

Yet alternatively or additionally, the polynucleotides according to this
broad aspect of the present invention are preferably at least 70 %, at least
75 %, at least 80 %, at least 85 %, at least 90 %, at least 95 % or more, say
95 % - 100 %, identical with SEQ ID NOs:1 or 3 as determined using the BestFit
software of the Wisconsin sequence analysis package, utilizing the Smith and
Waterman algorithm, where gap weight equals 50, length weight equals 3,
average match equals 10 and average mismatch equals -9.
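
BestFit itself belongs to the Wisconsin package and is not reproduced here.
Purely to illustrate how a percentage identity threshold of the kind recited
above can be checked, the sketch below computes percent identity over two
sequences that have already been aligned, with "-" marking gaps. The function
names, the choice of denominator (all aligned columns except those where both
sequences have a gap) and the 70 % default are assumptions made for
illustration and do not represent the BestFit procedure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences have a gap.
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str,
                    threshold_pct: float = 70.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_pct


if __name__ == "__main__":
    print(percent_identity("ACGT", "ACGA"))
    print(meets_threshold("ACGT", "AGGA"))
```

A gap opposite a residue counts as a mismatch in this sketch; other
conventions exist and would give somewhat different percentages.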

Thus, this broad aspect of the present invention encompasses (i)
polynucleotides as set forth in SEQ ID NOs:1 or 3; (ii) fragments thereof; (iii)
sequences hybridizable therewith; (iv) sequences homologous thereto; (v)
sequences encoding similar polypeptides with different codon usage; and (vi)
altered sequences characterized by mutations, such as deletion, insertion or
substitution of one or more nucleotides, either naturally occurring or
man-induced, either randomly or in a targeted fashion.

According to another aspect of the present invention there is provided a 
nucleic acid construct comprising the isolated nucleic acid described herein. 

According to a preferred embodiment, the nucleic acid construct
according to this aspect of the present invention further comprises at least one
cis-acting control (regulatory) element for regulating the expression of the
isolated nucleic acid. Such cis-acting regulatory elements include, for
example, promoters, which are known to be sequence elements required for
transcription, as they serve to bind DNA-dependent RNA polymerase, which
transcribes sequences present downstream thereof. Further details relating to
various regulatory elements are described hereinbelow.

While the isolated nucleic acid described herein is an essential element
of the invention, it is modular and can be used in different contexts. The
promoter of choice that is used in conjunction with this invention is of
secondary importance, and will comprise any suitable promoter. It will be
appreciated by one skilled in the art, however, that it is necessary to make sure
that the transcription start site(s) will be located upstream of an open reading
frame. In a preferred embodiment of the present invention, the promoter that
is selected comprises an element that is active in the particular host cells of
interest. These elements may be selected from transcriptional regulators that
activate the transcription of genes essential for the survival of these cells in
conditions of stress or starvation, including the heat shock proteins.

A construct according to the present invention preferably further
includes an appropriate selectable marker. In a more preferred embodiment
according to the present invention the construct further includes an origin of
replication. In another most preferred embodiment according to the present
invention the construct is a shuttle vector, which can propagate both in E. coli
(wherein the construct comprises an appropriate selectable marker and origin
of replication) and be compatible for propagation in cells, or integration in the
genome, of an organism of choice, such as a plant. The construct according to
this aspect of the present invention can be, for example, a plasmid, a bacmid, a
phagemid, a cosmid, a phage, a virus or an artificial chromosome.

According to an additional aspect of the present invention there is
provided a recombinant protein comprising a polypeptide having a
β-glucosidase catalytic activity. The polypeptide is preferably derived from
Aspergillus niger and preferably includes a signal peptide and optionally an
endoplasmic reticulum retaining peptide.

According to preferred embodiments, the polypeptide according to this
aspect of the present invention is as set forth in SEQ ID NO:2 or a portion
thereof which retains β-glucosidase catalytic activity.

SEQ ID NO:2 of A. niger β-glucosidase is similar to the amino acid
sequence of the β-glucosidase of A. kawachii. However, while the former is
highly stable over a wide range of temperatures and pH treatments, the latter
is relatively unstable, and thus has certain disadvantages, rendering its use
for the purposes of the present invention, as is further detailed and described
hereinunder, unfeasible and/or much less attractive.

Recently, Iwashita and coworkers have published the sequence of a
β-glucosidase (GenBank/EMBL AB003470) obtained from Aspergillus kawachii
strain IFO4308. Sequence comparison between Aspergillus kawachii
β-glucosidase and A. niger β-glucosidase revealed that the two share 98 %
homology.

The enzymes of the two Aspergillus species contain seven cysteine residues
and an identical number of glycosylation sites, while differing in their degree
of glycosylation (35).

The physical and kinetic properties of three β-glucosidases from
Aspergillus kawachii were described (35), and the three were shown to be
products of the same gene, differing solely by the degree of glycosylation. The
three purified A. kawachii β-glucosidases were readily inactivated, even at
moderate pH and temperature conditions. In sharp distinction, examination of
the stability of the recombinant A. niger β-glucosidase according to
the present invention under conditions identical to those described by Iwashita
et al., as described hereinbelow in the Examples section, revealed that the
enzyme is highly stable, retaining the majority of its enzymatic activity even
after 1 hour of incubation at 60 °C (68 % activity, defined as percent of the
activity of an enzyme kept at 4 °C).
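
The stability figure above is a ratio of the activity measured after heat
treatment to the activity of a control kept at 4 °C. As a minimal sketch of
that arithmetic (the function name and the example readings are hypothetical
and are not data from the specification):

```python
def residual_activity_percent(treated_activity: float,
                              control_activity: float) -> float:
    """Activity of a treated sample as a percentage of the 4 °C control."""
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * treated_activity / control_activity


if __name__ == "__main__":
    # Hypothetical readings in arbitrary activity units.
    print(residual_activity_percent(34.0, 50.0))
```

Both activities must be measured with the same assay and expressed in the same
units for the percentage to be meaningful.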

Thus, despite the similarity between the A. kawachii and A. niger
β-glucosidases, the A. niger enzyme unexpectedly exhibits significantly higher
thermal and pH stability.

According to yet another aspect of the present invention there is
provided a host cell comprising a nucleic acid construct as described herein.
The term "host cell" refers to a recipient of a heterologous nucleic acid, which
host cell can be either a prokaryotic cell, such as E. coli, or a eukaryotic cell,
such as a yeast cell, a filamentous fungus cell, a plant cell or an animal cell.
Examples of yeast cells include, but are not limited to, Pichia sp. such as P.
pastoris, and Saccharomyces sp. such as S. cerevisiae.

As used herein and in the claims section which follows, the term
"heterologous" when used in context of a nucleic acid sequence or a protein
found within a plant, plant derived tissue or plant cells, or alternatively, within
a eukaryotic cell, such as yeast, or a prokaryotic cell such as bacteria, refers to
nucleic acid or amino acid sequences typically not native to the plant, plant
derived tissue or plant cells, or alternatively, to the eukaryotic cell, such as
yeast, or the prokaryotic cell, such as bacteria. Interchangeably, nucleic acid
or amino acid sequences typically not native to the plant, plant derived tissue
or plant cells, or alternatively, to the eukaryotic cell, such as yeast, or the
prokaryotic cell, such as bacteria, are referred to by "recombinant nucleic acid"
and "recombinant protein", respectively. Thus, a recombinant nucleic acid is
one that has a sequence that is not naturally occurring or has a sequence that is
made by an artificial combination of two otherwise separated segments of
sequence. This artificial combination is often accomplished by chemical
synthesis or, more commonly, by the artificial manipulation of isolated
segments of nucleic acids, e.g., by genetic engineering techniques.

As used herein in the specification and in the claims section that
follows, the term "eukaryotic cell" refers to a cell containing a diploid genome
through at least a portion of its life cycle, having a membrane-bound nucleus
with chromosomes made of DNA, with cell division involving a form of
mitosis in which spindles are involved. Possession of a eukaryote type of cell
characterizes the four kingdoms, Protoctista, Fungi, Plantae and Animalia.

As used herein in the specification and in the claims section that
follows, the term "prokaryotic cell" refers to various bacteria and blue-green
algae, characterized by the absence of the nuclear organization, mitotic
capacities and complex organelles that typify the eukaryote superkingdom.
Examples of prokaryotic cells according to the present invention are bacteria,
such as, but not limited to, E. coli.

According to still another aspect of the present invention there is 
provided an organism comprising a nucleic acid construct as described herein, 
such as, but not limited to, a plant. Such an organism is said to be transformed 
or virally infected. 

As used herein the term "transformed" and its conjugations such as
transformation, transforming and transform, all relate to the process of
introducing heterologous nucleic acid sequences into a cell or an organism,
which nucleic acid sequences are propagatable to the offspring. The term thus
reads on, for example, "genetically modified", "transgenic" and "transfected",
which may be used herein to further describe and/or claim the present
invention. The term relates both to introduction of a heterologous nucleic acid
sequence into the genome of an organism and/or into the genome of a nucleic
acid containing organelle thereof, such as into a genome of a chloroplast or a
mitochondrion.






WO 01/36586 PCT/IL00/00758

As used herein the phrase "virally infected" includes infection by a virus
carrying a heterologous nucleic acid sequence. Such infection typically results
in transient expression of the nucleic acid sequence, which nucleic acid
sequence is typically not integrated into a genome and therefore not
propagatable to offspring, unless further infection of such offspring is
experienced.

There are various methods of introducing foreign genes into both
monocotyledonous and dicotyledonous plants (Potrykus, I., Annu. Rev. Plant.
Physiol., Plant. Mol. Biol. (1991) 42:205-225; Shimamoto et al, Nature
(1989) 338:274-276). The principal methods of causing stable integration of
exogenous DNA into plant genomic DNA include two main approaches:

(i) Agrobacterium-mediated gene transfer: Klee et al (1987) Annu.
Rev. Plant Physiol. 38:467-486; Klee and Rogers in Cell Culture and
Somatic Cell Genetics of Plants, Vol. 6, Molecular Biology of Plant Nuclear
Genes, eds. Schell, J., and Vasil, L. K., Academic Publishers, San Diego,
Calif. (1989) p. 2-25; Gatenby, in Plant Biotechnology, eds. Kung, S. and
Arntzen, C. J., Butterworth Publishers, Boston, Mass. (1989) p. 93-112.

(ii) direct DNA uptake: Paszkowski et al, in Cell Culture and Somatic
Cell Genetics of Plants, Vol. 6, Molecular Biology of Plant Nuclear Genes,
eds. Schell, J., and Vasil, L. K., Academic Publishers, San Diego, Calif.
(1989) p. 52-68; including methods for direct uptake of DNA into protoplasts,
Toriyama, K. et al (1988) Bio/Technology 6:1072-1074; DNA uptake
induced by brief electric shock of plant cells: Zhang et al Plant Cell Rep.
(1988) 7:379-384; Fromm et al Nature (1986) 319:791-793; DNA injection
into plant cells or tissues by particle bombardment, Klein et al
Bio/Technology (1988) 6:559-563; McCabe et al Bio/Technology (1988)
6:923-926; Sanford, Physiol. Plant. (1990) 79:206-209; by the use of
micropipette systems: Neuhaus et al, Theor. Appl. Genet. (1987) 75:30-36;
Neuhaus and Spangenberg, Physiol. Plant. (1990) 79:213-217; or by the direct
incubation of DNA with germinating pollen, DeWet et al in Experimental
Manipulation of Ovule Tissue, eds. Chapman, G. P. and Mantell, S. H. and
Daniels, W., Longman, London (1985) p. 197-209; and Ohta, Proc. Natl.
Acad. Sci. USA (1986) 83:715-719.

The Agrobacterium system includes the use of plasmid vectors that
contain defined DNA segments that integrate into the plant genomic DNA.
Methods of inoculation of the plant tissue vary depending upon the plant
species and the Agrobacterium delivery system. A widely used approach is the
leaf disc procedure, which can be performed with any tissue explant that
provides a good source for initiation of whole plant differentiation (Horsch et
al in Plant Molecular Biology Manual A5, Kluwer Academic Publishers,
Dordrecht (1988) p. 1-9). A supplementary approach employs the
Agrobacterium delivery system in combination with vacuum infiltration. The
Agrobacterium system is especially viable in the creation of transgenic
dicotyledonous plants.

There are various methods of direct DNA transfer into plant cells. In
electroporation, the protoplasts are briefly exposed to a strong electric field. In
microinjection, the DNA is mechanically injected directly into the cells using
very small micropipettes. In microparticle bombardment, the DNA is adsorbed
on microprojectiles such as magnesium sulfate crystals or tungsten particles,
and the microprojectiles are physically accelerated into cells or plant tissues.

Following transformation, plant propagation is exercised. The most
common method of plant propagation is by seed. Regeneration by seed
propagation, however, has the deficiency that due to heterozygosity there is a
lack of uniformity in the crop, since seeds are produced by plants according to
the genetic variances governed by Mendelian rules. Basically, each seed is
genetically different and each will grow with its own specific traits.
Therefore, it is preferred that the transformed plant be produced such that the
regenerated plant has the identical traits and characteristics of the parent
transgenic plant. Accordingly, it is preferred that the transformed plant be
regenerated by micropropagation, which provides a rapid, consistent
reproduction of the transformed plants.

Micropropagation is a process of growing new generation plants from a
single piece of tissue that has been excised from a selected parent plant or
cultivar. This process permits the mass reproduction of plants having the
preferred tissue expressing the protein. The new generation plants, which are
produced, are genetically identical to, and have all of the characteristics of, the
original plant. Micropropagation allows mass production of quality plant
material in a short period of time and offers a rapid multiplication of selected
cultivars in the preservation of the characteristics of the original transgenic or
transformed plant. The advantages of cloning plants are the speed of plant
multiplication and the quality and uniformity of plants produced.

Micropropagation is a multi-stage procedure that requires alteration of
culture medium or growth conditions between stages. Thus, the
micropropagation process involves four basic stages: stage one, initial tissue
culturing; stage two, tissue culture multiplication; stage three, differentiation
and plant formation; and stage four, greenhouse culturing and hardening.
During stage one, initial tissue culturing, the tissue culture is established and
certified contaminant-free. During stage two, the initial tissue culture is
multiplied until a sufficient number of tissue samples are produced to meet
production goals. During stage three, the tissue samples grown in stage two
are divided and grown into individual plantlets. At stage four, the transformed
plantlets are transferred to a greenhouse for hardening, where the plants'
tolerance to light is gradually increased so that they can be grown in the
natural environment.

Sequences suitable for permitting integration of the heterologous
sequence into the plant genome are recommended. These might include
transposon sequences and the like for homologous recombination as well as Ti
sequences which permit random insertion of a heterologous expression
cassette into a plant genome.

Suitable prokaryote selectable markers include resistance toward
antibiotics such as ampicillin or tetracycline. Other DNA sequences encoding
additional functions may also be present in the vector, as is known in the art.

The constructs of the subject invention will include an expression 
cassette for expression of the protein of interest. Usually, there will be only 
one expression cassette, although two or more are feasible. The recombinant 
expression cassette will contain, in addition to the heterologous sequence, one
or more of the following sequence elements: a promoter region, plant 5'
untranslated sequences which can include regulatory elements, an initiation
codon, depending upon whether or not the structural gene comes equipped with
one, and a transcription and translation termination sequence. Unique
restriction enzyme sites at the 5' and 3' ends of the cassette allow for easy
insertion into a pre-existing vector.

As used herein, the phrase "regulatory element" refers to nucleotide
sequences which are typically included within an expression cassette and
function in regulating (i.e., enhancing or depressing) the expression of a
coding sequence therefrom. This regulation can be effected either at the
transcription or the translation stages. Examples of regulatory elements
include, but are not limited to, enhancers, suppressors and transcription
terminators.

As used herein the term "promoter" refers to a nucleotide sequence, 
which can direct gene expression in cells. Such a promoter can be derived 
from a plant, a plant virus, or from any other living organism including 
bacteria and animals. 

A plant promoter can be a constitutive promoter, such as, but not
limited to, CaMV35S and CaMV19S promoters, FMV34S promoter,
sugarcane bacilliform badnavirus promoter, CsVMV promoter, Arabidopsis
ACT2/ACT8 actin promoter, Arabidopsis ubiquitin UBQ1 promoter, barley
leaf thionin BTH6 promoter, and rice actin promoter.

The promoter can alternatively be a tissue specific promoter.
Examples of plant tissue specific promoters include, without being limited to,
bean phaseolin storage protein promoter, DLEC promoter, PHSp promoter,
zein storage protein promoter, conglutin gamma promoter from soybean, AT2S1
gene promoter, ACT11 actin promoter from Arabidopsis, napA promoter from
Brassica napus, potato patatin gene promoter and the Tob promoter.

The promoter may also be a promoter which is active in a specific
developmental stage of a plant's life cycle, for example, a promoter active in
late embryogenesis, such as the LEA promoter; an endosperm-specific
expression promoter (the seed storage prolamin from rice is expressed in
tobacco seed at the developmental stage about 20 days after flowering); or the
promoter controlling the FbL2A gene during fiber wall synthesis stages.

A tissue-specific promoter ensures that the heterologous
protein is expressed only in the desired tissue, for example, only in the flower,
the fruit, the root, the seed, etc.

Both the tissue-specific and the non-specific promoters may be 
constitutive, i.e., may cause continuous expression of the heterologous protein. 

The promoter may also be an inducible promoter, i.e., a promoter which
is activated by the presence of an inducing agent, and only upon said
activation causes expression of the heterologous protein. An inducing agent
can be, for example, light, chemicals, drought, high salinity, osmotic shock,
oxidant conditions or pathogenicity. Inducible promoters include, without being
limited to, the light-inducible promoter derived from the pea rbcS gene, the
promoter from the alfalfa rbcS gene, the promoters DRE, MYC and MYB
active in drought; the promoters INT, INPS, prxEa, Ha hsp17.7G4 and RD21
active in high salinity and osmotic stress; the promoters hsr303J and str246C
active in pathogenic stress; the copper-controllable gene expression system and
the steroid-inducible gene system.

Alternatively, an inducing agent may be an endogenous agent which is
normally present in only certain tissues of the plant, or is produced only at
certain time periods of the plant's life cycle, such as ethylene or steroids. By
using such an endogenous tissue-specific inducing agent, it is possible to
control the expression from such inducible promoters only in those specific
tissues. By using an inducing agent produced only during a specific period of
the life cycle, it is possible to control the expression from an inducible
promoter to the specific phase in the life cycle in which the inducing agent is
produced.

Bacterial and yeast derived promoters are well known in the art.

Viruses are a unique class of infectious agents whose distinctive
features are their simple organization and their mechanism of replication. In
fact, a complete viral particle, or virion, may be regarded mainly as a block of
genetic material (either DNA or RNA) capable of autonomous replication,
surrounded by a protein coat and sometimes by an additional membranous
envelope such as in the case of alpha viruses. The coat protects the virus from
the environment and serves as a vehicle for transmission from one host cell to
another.

Viruses that have been shown to be useful for the transformation of
plant hosts include CaV, TMV and BV. Transformation of plants using plant
viruses is described in U.S. Pat. No. 4,855,237 (BGV), EP-A 67,553 (TMV),
Japanese Published Application No. 63-14693 (TMV), EPA 194,809 (BV),
EPA 278,667 (BV); and Gluzman, Y. et al, Communications in Molecular
Biology: Viral Vectors, Cold Spring Harbor Laboratory, New York, pp.
172-189 (1988). Pseudovirus particles for use in expressing foreign DNA in
many hosts, including plants, are described in WO 87/06261.

Construction of plant RNA viruses for the introduction and expression
of non-viral foreign genes in plants is demonstrated by the above references as
well as by Dawson, W. O. et al., Virology (1989) 172:285-292; Takamatsu et
al. EMBO J. (1987) 6:307-311; French et al. Science (1986) 231:1294-1297;
and Takamatsu et al. FEBS Letters (1990) 269:73-76.

When the virus is a DNA virus, the constructions can be made to the
virus itself. Alternatively, the virus can first be cloned into a bacterial plasmid
for ease of constructing the desired viral vector with the foreign DNA. The
virus can then be excised from the plasmid. If the virus is a DNA virus, a
bacterial origin of replication can be attached to the viral DNA, which is then
replicated by the bacteria. Transcription and translation of this DNA will
produce the coat protein which will encapsidate the viral DNA. If the virus is
an RNA virus, the virus is generally cloned as a cDNA and inserted into a
plasmid. The plasmid is then used to make all of the constructions. The RNA
virus is then produced by transcribing the viral sequence of the plasmid and
translation of the viral genes to produce the coat protein(s) which encapsidate
the viral RNA.

Construction of plant RNA viruses for the introduction and expression
of non-viral foreign genes in plants is demonstrated by the above references as
well as in U.S. Pat. No. 5,316,931.

In one embodiment, a plant viral nucleic acid is provided in which the
native coat protein coding sequence has been deleted from a viral nucleic acid,
and a non-native plant viral coat protein coding sequence and a non-native
promoter, preferably the subgenomic promoter of the non-native coat protein
coding sequence, capable of expression in the plant host, packaging of the
recombinant plant viral nucleic acid, and ensuring a systemic infection of the
host by the recombinant plant viral nucleic acid, have been inserted.
Alternatively, the coat protein gene may be inactivated by insertion of the
non-native nucleic acid sequence within it, such that a protein is produced. The
recombinant plant viral nucleic acid may contain one or more additional
non-native subgenomic promoters. Each non-native subgenomic promoter is
capable of transcribing or expressing adjacent genes or nucleic acid sequences
in the plant host and incapable of recombination with each other and with
native subgenomic promoters. Non-native (foreign) nucleic acid sequences
may be inserted adjacent the native plant viral subgenomic promoter or the
native and a non-native plant viral subgenomic promoters if more than one
nucleic acid sequence is included. The non-native nucleic acid sequences are
transcribed or expressed in the host plant under control of the subgenomic
promoter to produce the desired products.

In a second embodiment, a recombinant plant viral nucleic acid is
provided as in the first embodiment except that the native coat protein coding
sequence is placed adjacent one of the non-native coat protein subgenomic
promoters instead of a non-native coat protein coding sequence.

In a third embodiment, a recombinant plant viral nucleic acid is
provided in which the native coat protein gene is adjacent its subgenomic
promoter and one or more non-native subgenomic promoters have been
inserted into the viral nucleic acid. The inserted non-native subgenomic
promoters are capable of transcribing or expressing adjacent genes in a plant
host and are incapable of recombination with each other and with native
subgenomic promoters. Non-native nucleic acid sequences may be inserted
adjacent the non-native subgenomic plant viral promoters such that said
sequences are transcribed or expressed in the host plant under control of the
subgenomic promoters to produce the desired product.

In a fourth embodiment, a recombinant plant viral nucleic acid is
provided as in the third embodiment except that the native coat protein coding
sequence is replaced by a non-native coat protein coding sequence.

The viral vectors are encapsidated by the coat proteins encoded by the
recombinant plant viral nucleic acid to produce a recombinant plant virus. The
recombinant plant viral nucleic acid or recombinant plant virus is used to
infect appropriate host plants. The recombinant plant viral nucleic acid is
capable of replication in the host, systemic spread in the host, and transcription
or expression of foreign gene(s) in the host to produce the desired protein.

In many instances it is desired to target the expression of a recombinant
protein. Such targeting can be into a cellular organelle or outside of the cell.
This can be effected, as is well known in the art, by appropriate signal
peptides, which are fused to the polypeptide to be targeted, typically at the
N-terminus.

Thus, as used herein and in the claims section which follows, the phrase
"signal peptide" refers to a stretch of amino acids which is effective in
targeting a protein expressed in a cell into a target location. Different signal
peptides, which are known in the art, are effective in secreting a protein from
bacteria, yeast, plant and animal cells.

It should be noted in this respect that signal peptides serve the function
of translocation of produced protein across the endoplasmic reticulum
membrane. Similarly, transmembrane segments halt translocation and provide
anchoring of the protein to the plasma membrane, see, Johnson et al., The Plant
Cell (1990) 2:525-532; Sauer et al., EMBO J. (1990) 9:3045-3050; Mueckler
et al., Science (1985) 229:941-945. Mitochondrial, nuclear, chloroplast, or
vacuolar signals target expressed protein correctly into the corresponding
organelle through the secretory pathway, see, Von Heijne, Eur. J. Biochem.
(1983) 133:17-21; Von Heijne, J. Mol. Biol. (1986) 189:239-242; Iturriaga et
al., The Plant Cell (1989) 1:381-390; McKnight et al., Nucl. Acid Res. (1990)
18:4939-4943; Matsuoka and Nakamura, Proc. Natl. Acad. Sci. USA (1991)
88:834-838. A recent book by Cunningham and Porter (Recombinant proteins
from plants, Eds. C. Cunningham and A.J.R. Porter, 1998 Humana Press
Totowa, N.J.) describes methods for the production of recombinant proteins in
plants and methods for targeting the proteins to different compartments in the
plant cell. In particular, two chapters therein (14 and 15) describe different
methods to introduce targeting sequences that result in accumulation of
recombinant proteins in compartments such as ER, vacuole, plastid, nucleus
and cytoplasm. The book by Cunningham and Porter is incorporated herein by
reference. Presently, the preferred site of accumulation of the fusion protein
according to the present invention is the ER, using a signal peptide such as Cel 1
or the rice amylase signal peptide at the N-terminus and an ER retaining
peptide (HDEL, SEQ ID NO:17; or KDEL, SEQ ID NO:24) at the C-terminus.

According to an additional aspect of the present invention there is
provided a method of producing recombinant β-glucosidase. The method
according to this aspect of the present invention is effected by introducing, in
an expressible or overexpressible form, a nucleic acid construct into a host
cell. The nucleic acid construct includes a genomic, complementary or
composite polynucleotide preferably derived from Aspergillus niger and
encoding a polypeptide having a β-glucosidase catalytic activity. The
polynucleotide preferably further encodes a signal peptide in frame with the
polypeptide. Still preferably, the polynucleotide further encodes an
endoplasmic reticulum retaining peptide in frame with the polypeptide.
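
By way of a non-limiting illustration, the in-frame arrangement described above (signal peptide, polypeptide and an optional endoplasmic reticulum retaining peptide) can be sketched as a simple check on a coding sequence. The nucleotide segments below are short hypothetical placeholders and are not sequences of the present invention, and the function names are assumptions made for this sketch only. The codons shown for KDEL and HDEL are merely one possible encoding of the retaining peptides named in the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Hypothetical placeholder segments (NOT sequences of the invention).
SIGNAL_PEPTIDE = "ATGAAACTGCTG"   # toy signal peptide: M K L L
MATURE_ENZYME = "GCTTCTCCACCA"    # toy polypeptide fragment: A S P P
# One possible codon choice for the ER retaining peptides named in the text.
ER_RETAINING = {"KDEL": "AAAGATGAACTG", "HDEL": "CATGATGAACTG"}


def codons(dna):
    """Split a DNA string into consecutive triplets (the last may be short)."""
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def is_in_frame(segment):
    """A segment keeps the reading frame if it is made of whole codons
    and contains no stop codon."""
    segment = segment.upper()
    if len(segment) == 0 or len(segment) % 3 != 0:
        return False
    return not any(c in STOP_CODONS for c in codons(segment))


def assemble_orf(signal, mature, retaining=None):
    """Join signal peptide, polypeptide and an optional ER retaining peptide,
    then append a single terminal stop codon."""
    parts = [signal, mature]
    if retaining is not None:
        parts.append(ER_RETAINING[retaining])
    for part in parts:
        if not is_in_frame(part):
            raise ValueError("segment is not in frame: %r" % part)
    return "".join(p.upper() for p in parts) + "TAA"


if __name__ == "__main__":
    print(assemble_orf(SIGNAL_PEPTIDE, MATURE_ENZYME, "KDEL"))
```

A segment whose length is not a multiple of three, or which carries an internal stop codon, is rejected, since either would shift or terminate the reading frame before the retaining peptide is reached.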

As used herein the term "introducing" refers both to transforming and to
virally infecting, as these terms are further defined hereinabove. As used
herein the terms "expressible form" and "overexpressible form" refer to a
recombinant form which includes the required regulatory elements to effect
expression or overexpression of a coding region, all as is further detailed
hereinabove.

According to a preferred embodiment of this aspect of the present
invention, after sufficient expression has been detected, the polypeptide having
the β-glucosidase catalytic activity is extracted from the expressing host cell.

Thus host cells, expressing the polypeptide according to the present 
invention, provide an immediate, easy and indefinite source of the polypeptide. 

Any number of well-known liquid or solid culture media may be used
for appropriately culturing host cells of the present invention, although growth
on liquid media is preferred as the secretion of the polypeptide into the media
results in simplification of polypeptide recovery. As is further detailed
hereinabove, such secretion can be effected by the incorporation of a suitable
signal peptide. The β-glucosidase may be isolated or separated or purified
from host cell preparations using techniques well known in the art, such as, but
not limited to, centrifugation, filtration, chromatography, electrophoresis and
dialysis. Further concentration and/or purification of the β-glucosidase may be
effected by use of conventional techniques, including, but not limited to,
ultrafiltration, further dialysis, ion-exchange chromatography, HPLC,
size-exclusion chromatography, cellobiose-sepharose affinity chromatography, and
electrophoresis, such as polyacrylamide-gel-electrophoresis (PAGE). Using
these techniques, β-glucosidase may be recovered in pure or substantially pure
form.

According to an additional aspect of the present invention there is
provided a method of increasing a level of at least one fermentation substance
in a fermentation product. The method according to this aspect of the present
invention is effected by fermenting a glucose containing fermentation starting
material by a yeast cell overexpressing a nucleic acid construct which includes
a genomic, complementary or composite polynucleotide preferably derived
from Aspergillus niger and which encodes a polypeptide having a
β-glucosidase catalytic activity, thereby increasing the level of the at least one
fermentation substance in the fermentation product. The polynucleotide
preferably further encodes a signal peptide in frame with the polypeptide. Still
preferably, the polynucleotide further encodes an endoplasmic reticulum
retaining peptide in frame with the polypeptide.

According to an alternative aspect of the present invention there is
provided a method of increasing a level of at least one fermentation substance
in a fermentation product. The method according to this aspect of the present
invention is effected by fermenting a plant derived glucose containing
fermentation starting material by a yeast cell, the plant overexpressing a
nucleic acid construct which includes a genomic, complementary or composite
polynucleotide preferably derived from Aspergillus niger and which encodes a
polypeptide having a β-glucosidase catalytic activity, thereby increasing the
level of the at least one fermentation substance in the fermentation product.
The polynucleotide preferably further encodes a signal peptide in frame with
the polypeptide. Still preferably, the polynucleotide further encodes an
endoplasmic reticulum retaining peptide in frame with the polypeptide.

As used herein in the specification and in the claims section that
follows, the term "fermentation" refers to a chemical change induced in a
complex organic compound by the action of an enzyme, whereby the substance
is split into simpler compounds. Specifically, the term "fermentation" includes
the anaerobic dissimilation of substrates with the production of energy and
reduced compounds, the final products of which are organic acids, alcohols,
such as ethanol, isopropanol, butanol, etc., and CO2. Such products are
typically secreted, and each is referred to herein as a "fermentation
substance", i.e., any known fermentation resultant of either microbial or yeast
fermentation.

As used herein in the specification and in the claims section that
follows, the phrase "fermentation product" refers to the resultant material of a
fermentation process. Examples include, but are not limited to, alcohol
containing fermentation medium and alcoholic beverages, such as, but not
limited to, fruit-based alcohol-containing beverages, wines and beers.

When used in conjunction with, for example, a β-glucanase, the
β-glucosidase is effective for hydrolyzing a variety of cellulose containing
materials to glucose. The glucose produced by enzymatic hydrolysis of the
cellulose and other glucose containing saccharides may be recovered and
stored, or it may be subsequently fermented to ethanol using conventional
techniques. Many processes for the fermentation of glucose generated from
cellulose are well known, and are suitable for use herein. Briefly, the
hydrolyzate containing the glucose from the enzymatic reaction is contacted
with an appropriate microorganism under conditions effective for the
fermentation of the glucose to ethanol. This fermentation may be separate from
and follow the enzymatic hydrolysis of the cellulose (sequentially processed),
or the hydrolysis and fermentation may be concurrent and conducted in the
same vessel (simultaneously processed). Details of the various fermentation
techniques, conditions, and suitable microorganisms have been described, for
example, by Wyman (1994, Bioresource Technol., 50:3-16) or Olsson and
Hahn-Hagerdal (1996, Enzyme Microbial Technol., 18:312-331), the content
of each of which is incorporated herein by reference. Following the
completion of a fermentation, the alcohol may be recovered by extraction, and
optionally purified, e.g., by distillation.
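
As a rough worked example of the hydrolysis and fermentation route described above, the following sketch computes the stoichiometric upper bounds for glucose released from cellobiose (one cellobiose plus one water gives two glucose) and for ethanol formed from glucose (one glucose gives two ethanol and two CO2). The function names are assumptions made for this illustration, and the figures are theoretical maxima rather than yields reported in this document; actual yields are lower.

```python
# Approximate molar masses in g/mol.
M_GLUCOSE = 180.16      # C6H12O6
M_CELLOBIOSE = 342.30   # C12H22O11
M_ETHANOL = 46.07       # C2H5OH


def glucose_from_cellobiose(grams_cellobiose):
    """Theoretical glucose (g) from complete hydrolysis of cellobiose:
    cellobiose + H2O -> 2 glucose."""
    moles = grams_cellobiose / M_CELLOBIOSE
    return moles * 2 * M_GLUCOSE


def max_ethanol_from_glucose(grams_glucose):
    """Theoretical ethanol (g) from complete fermentation of glucose:
    glucose -> 2 ethanol + 2 CO2."""
    moles = grams_glucose / M_GLUCOSE
    return moles * 2 * M_ETHANOL


if __name__ == "__main__":
    glucose = glucose_from_cellobiose(100.0)
    ethanol = max_ethanol_from_glucose(glucose)
    print("glucose (g): %.2f" % glucose)
    print("ethanol upper bound (g): %.2f" % ethanol)
```

The same two functions apply whether hydrolysis and fermentation are run sequentially or in the same vessel, since the stoichiometric ceiling does not depend on the process layout.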

Thus, according to still another aspect of the present invention there is
provided a method of producing an alcohol. The method according to this
aspect of the present invention is effected by fermenting a glucose containing
fermentation starting material by a cell overexpressing a nucleic acid construct
including a genomic, complementary or composite polynucleotide preferably
derived from Aspergillus niger, encoding a polypeptide having a β-glucosidase
catalytic activity, and extracting the alcohol therefrom. The polynucleotide
preferably further encodes a signal peptide in frame with the polypeptide. Still
preferably, the polynucleotide further encodes an endoplasmic reticulum
retaining peptide in frame with the polypeptide.

According to an additional aspect of the present invention there is
provided a method of producing an alcohol. The method according to this
aspect of the present invention is effected by fermenting a plant derived
glucose containing fermentation starting material by a cell, the plant
overexpressing a nucleic acid construct including a genomic, complementary
or composite polynucleotide preferably derived from Aspergillus niger,
encoding a polypeptide having a β-glucosidase catalytic activity, and
extracting the alcohol therefrom. The polynucleotide preferably further
encodes a signal peptide in frame with the polypeptide. Still preferably, the
polynucleotide further encodes an endoplasmic reticulum retaining peptide in
frame with the polypeptide.

Plants contain aroma and flavor compounds of glycosidic nature; their
inherent aroma property can be released by degrading enzymes, turning a
non-volatile aroma compound into its volatile form. Thus, for example,
α-L-arabinofuranosidases assist in the liberation of aroma compounds from
substrates such as juices or wines, as described by Gunata et al. (European
Patent Application No. 332.281, 1989; and "Purification and some properties
of an alpha-L-arabinofuranosidase from A. niger. Action on grape monoterpenyl
arabinofuranosylglucosides", J. Agric. Food Chem. 38:772-776, 1990). This
outcome is achieved, for example, in a two step process wherein the first step
comprises the use of an α-L-arabinofuranosidase to catalyze the release of
arabinose residues from monoterpenyl α-L-arabinofuranosyl glucosides
contained in, for example, the fruit or vegetable juice via the cleavage of the
(1→6) linkage between a terminal arabinofuranosyl unit and the intermediate
glucose of a monoterpenyl α-L-arabinofuranosylglucoside. The
α-L-arabinofuranosidase is preferably in a purified form so as to avoid the
undesirable degradation of other components of the juice which may be
detrimental to its ultimate quality. In the second step, β-glucosidase is required
to yield the free terpenol from the resulting desarabinosylated monoterpenyl
glucoside. If desired, both reaction steps may be performed in the same
reaction vessel without the need to isolate the intermediate product (Gunata et
al. (1989), supra). Thus, β-glucosidase is an essential contributor when the
liberation of these aroma compounds for improving the flavor of the juice or
wine is desired. Moreover, in the case of wine, the control of the liberation of
aroma compounds provides wines with a more consistent flavor, thus reducing
or eliminating the undesirable effect of "poor vintage years". Additional
information is contained in: "Cloning and expression of DNA molecules
encoding arabinan degrading enzyme of fungal origin", U.S. Pat. No.
5,863,783; Y. Gueguen, et al. "A Very Efficient β-Glucosidase Catalyst for the
Hydrolysis of Flavor Precursors of Wines and Fruit Juices", J. Agric. Food
Chem. 44:2336-2340, 1996, each of which is incorporated herein by reference.

Thus, according to a further aspect of the present invention there is
provided a method of increasing a level of at least one aroma substance in a
plant derived product, such as, but not limited to, an alcoholic beverage. The
method according to this aspect of the present invention is effected by
incubating a glucose containing plant starting material with a yeast cell
overexpressing a nucleic acid construct including a genomic, complementary
or composite polynucleotide preferably derived from Aspergillus niger which
encodes a polypeptide having a β-glucosidase catalytic activity, thereby
increasing the level of the at least one aroma substance in the plant derived
product. The polynucleotide preferably further encodes a signal peptide in
frame with the polypeptide. Still preferably, the polynucleotide further
encodes an endoplasmic reticulum retaining peptide in frame with the
polypeptide.

While reducing the present invention to practice it was discovered that
in order to obtain activity of a β-glucosidase in a transgenic plant, the
expression construct should include a signal peptide. In addition, it was found
that retaining the enzyme in the endoplasmic reticulum results in higher release
of aroma compounds following homogenization and incubation. It is assumed
that compartmentalization of the enzyme in, for example, the ER prevents it
from interacting with its substrates, which are mainly outside the cells, limiting
such interaction to the period following homogenization. Indeed, directing the
enzyme to the apoplast resulted in increased release of aroma in vivo. Thus,
depending on the specific application, one can choose whether to include in
the construct an endoplasmic reticulum retaining peptide or not.

According to yet a further aspect of the present invention there is
provided a method of increasing a level of at least one aroma substance in a
plant derived product, such as, but not limited to, an alcoholic beverage. The
method according to this aspect of the present invention is effected by
incubating a glucose containing plant starting material with a yeast cell, said
plant overexpressing a nucleic acid construct including a genomic,
complementary or composite polynucleotide preferably derived from
Aspergillus niger which encodes a polypeptide having a β-glucosidase
catalytic activity, thereby increasing the level of the at least one aroma
substance in the plant derived product. The polynucleotide preferably further
encodes a signal peptide in frame with the polypeptide. Still preferably, the
polynucleotide further encodes an endoplasmic reticulum retaining peptide in
frame with the polypeptide.

As used herein in the specification and in the claims section that
follows, the phrase "glucose containing starting material" refers to any source
of energy, in the form of glucose containing compounds, other than free
glucose, including, but not limited to, crushed, minced, diced or extracted plant
material, plant, or portions thereof, such as fruits, examples of which are
tropical fruits and grapes.

According to an additional aspect of the present invention there is
provided a method of producing an aroma spreading plant. As used herein in
the specification and in the claims section that follows, the phrase "aroma
spreading plant" refers to substantially any part of a plant in which volatile
compounds are generated by the catalytic activity of the β-glucosidase
polypeptide of the present invention, and from which the release of volatile
compounds is perceived by the olfactory system of an organism, such as a human.

The method according to this aspect of the present invention is effected
by overexpressing in the plant a nucleic acid construct including a genomic,
complementary or composite polynucleotide derived from Aspergillus niger,
which encodes a polypeptide having a β-glucosidase catalytic activity, thereby
increasing aroma spread from the plant. Such overexpression is preferably
performed in a tissue specific manner by, for example, employing a tissue
specific promoter, as hereinabove described, to thereby overexpress a
heterologous protein in a selected portion of the plant. The tissue in which
such overexpression is effected is selected according to the availability of
glucose containing non-volatile aroma substrates therein. Thus, such an
overexpression will cause the release of a volatile and aroma constituent of the
substrate. Thus, according to preferred embodiments, overexpressing the
nucleic acid construct is limited to at least one tissue, such as a flower, a fruit,
a seed, a root, a stem, pollen and leaves.

According to still a further aspect of the present invention there is
provided a method of increasing a level of free glucose in a glucose containing
fermentation starting material. The method according to this aspect of the
present invention is effected by fermenting the glucose containing
fermentation starting material by a cell overexpressing a nucleic acid construct
including a genomic, complementary or composite polynucleotide preferably
derived from Aspergillus niger, which encodes a polypeptide having a
β-glucosidase catalytic activity, thereby increasing the level of the free glucose in
the glucose containing fermentation starting material. The polynucleotide
preferably further encodes a signal peptide in frame with the polypeptide. Still
preferably, the polynucleotide further encodes an endoplasmic reticulum
retaining peptide in frame with the polypeptide.

According to another aspect of the present invention there is provided a
method of increasing a level of free glucose in a plant derived glucose
containing fermentation starting material. The method according to this aspect
of the present invention is effected by fermenting the plant derived glucose
containing fermentation starting material by a cell, the plant overexpressing a
nucleic acid construct including a genomic, complementary or composite
polynucleotide preferably derived from Aspergillus niger, which encodes a
polypeptide having a β-glucosidase catalytic activity, thereby increasing the
level of the free glucose in the plant. The polynucleotide preferably further
encodes a signal peptide in frame with the polypeptide. Still preferably, the
polynucleotide further encodes an endoplasmic reticulum retaining peptide in
frame with the polypeptide.

As used herein in the specification and in the claims section that
follows, the term "free glucose" refers to glucose residues in the form of a
monosaccharide, the levels of which are increased by the catalytic activity of
β-glucosidase.

As used herein in the specification and in the claims section that
follows, the phrase "glucose containing fermentation starting material" refers
to any source of energy, in the form of glucose containing compounds, other
than free glucose, including, but not limited to, crushed, minced, diced or
extracted plant material, plant, or portions thereof, used in industrial
fermentation processes.

According to yet another aspect of the present invention there is
provided a method of increasing a level of extra- or intracellular free glucose
in a plant. The method according to this aspect of the present invention is
effected by overexpressing in the plant a nucleic acid construct including a
genomic, complementary or composite polynucleotide preferably derived from
Aspergillus niger, which encodes a polypeptide having a β-glucosidase
catalytic activity, thereby increasing the level of the free glucose in the plant.
Thus, sweeter fruits can be produced. The polynucleotide preferably further
encodes a signal peptide in frame with the polypeptide. Still preferably, the
polynucleotide further encodes an endoplasmic reticulum retaining peptide in
frame with the polypeptide.

Glycosidases, including β-glucosidase, catalyze reactions involving the
hydrolysis of the O-glycosidic bond of glycosides, and synthesize oligosaccharides
when the reaction is run in reverse from the normal direction, a result achieved
by, for example, site directed mutagenesis, and Km reversal. As described in
the Background section hereinabove, the hydrolysis reaction mechanism of
glycosidases involves two catalytic steps, the second of which involves a base
catalyzed H2O attack, resulting in the regeneration of the enzyme, and the
release of the saccharide residue. Thus, in addition, oligosaccharide synthesis
can be achieved by adding a second saccharide to the reaction mixture, which
competes with the H2O molecule, and reacts in its place with the first
saccharide in what is known as a transglycosylation reaction. Hence, as
glycosidases are generally available and easy to handle, these enzymes have
the potential to catalyze the production of many different products using
inexpensive substrates. For further detail see U.S. Pat. No. 5,716,812, which
is incorporated herein by reference.
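
The competition between water and a second saccharide described above can be sketched with a simple partition ratio. The model below is an illustrative assumption and not a kinetic scheme disclosed in this document: the glycosyl-enzyme intermediate is taken to react with the acceptor saccharide at a rate k_t x [acceptor] and with water at a rate k_h x [H2O], so the fraction going to transglycosylation is the first rate divided by the sum. The rate constants and the default water concentration (about 55.5 M for dilute aqueous solution) are placeholders for the reader to replace.

```python
def transglycosylation_fraction(k_t, acceptor_molar, k_h, water_molar=55.5):
    """Fraction of the glycosyl-enzyme intermediate captured by the acceptor
    saccharide rather than by water, under a simple competition model.

    k_t, k_h: hypothetical second-order rate constants for attack by the
              acceptor saccharide and by water, in the same units.
    acceptor_molar, water_molar: concentrations in mol/L.
    """
    if min(k_t, acceptor_molar, k_h, water_molar) < 0:
        raise ValueError("rate constants and concentrations must be >= 0")
    rate_transfer = k_t * acceptor_molar
    rate_hydrolysis = k_h * water_molar
    total = rate_transfer + rate_hydrolysis
    if total == 0:
        raise ValueError("at least one pathway must have a non-zero rate")
    return rate_transfer / total


if __name__ == "__main__":
    # Illustrative values only: acceptor assumed 100x more reactive than water.
    for acceptor in (0.0, 0.1, 0.5, 1.0):
        frac = transglycosylation_fraction(100.0, acceptor, 1.0)
        print("acceptor %.1f M -> transfer fraction %.3f" % (acceptor, frac))
```

Under this model, raising the acceptor concentration or lowering water activity shifts the partition toward the oligosaccharide product, which is the qualitative behavior the paragraph above describes.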

Thus, according to yet an additional aspect of the present invention
there is provided a method of synthesizing oligosaccharides. The method
according to this aspect of the present invention is effected by mixing a
polypeptide having a β-glucosidase catalytic activity with first and second
saccharide molecules to thereby join the first and second saccharide molecules
into an oligosaccharide.

Additional objects, advantages, and novel features of the present 
invention will become apparent to one ordinarily skilled in the art upon 
examination of the following examples, which are not intended to be limiting. 
Additionally, each of the various embodiments and aspects of the present 
invention as delineated hereinabove and as claimed in the claims section below 
finds experimental support in the following examples. 

EXAMPLES 

Reference is now made to the following examples, which together with 
the above descriptions, illustrate the invention in a non limiting fashion. 

Generally, the nomenclature used herein and the laboratory procedures
utilized in the present invention include molecular, biochemical,
microbiological and recombinant DNA techniques. Such techniques are
thoroughly explained in the literature. See, for example, "Molecular Cloning:
A Laboratory Manual" Sambrook et al., (1989); "Current Protocols in
Molecular Biology" Volumes I-III Ausubel, R. M., ed. (1994); Ausubel et al.,
"Current Protocols in Molecular Biology", John Wiley and Sons, Baltimore,
Maryland (1989); Perbal, "A Practical Guide to Molecular Cloning", John
Wiley & Sons, New York (1988); Watson et al., "Recombinant DNA",
Scientific American Books, New York; Birren et al. (eds) "Genome Analysis:
A Laboratory Manual Series", Vols. 1-4, Cold Spring Harbor Laboratory
Press, New York (1998); methodologies as set forth in U.S. Pat. Nos.
4,666,828; 4,683,202; 4,801,531; 5,192,659 and 5,272,057; "Cell Biology: A
Laboratory Handbook", Volumes I-III Cellis, J. E., ed. (1994); "Current
Protocols in Immunology" Volumes I-III Coligan J. E., ed. (1994); Stites et
al. (eds), "Basic and Clinical Immunology" (8th Edition), Appleton & Lange,
Norwalk, CT (1994); Mishell and Shiigi (eds), "Selected Methods in Cellular
Immunology", W. H. Freeman and Co., New York (1980); available
immunoassays are extensively described in the patent and scientific
literature, see, for example, U.S.
Pat. Nos. 3,791,932; 3,839,153; 3,850,752; 3,850,578; 3,853,987; 3,867,517;
3,879,262; 3,901,654; 3,935,074; 3,984,533; 3,996,345; 4,034,074; 4,098,876;
4,879,219; 5,011,771 and 5,281,521; "Oligonucleotide Synthesis" Gait, M. J.,
ed. (1984); "Nucleic Acid Hybridization" Hames, B. D., and Higgins S. J., eds.
(1985); "Transcription and Translation" Hames, B. D., and Higgins S. J., eds.
(1984); "Animal Cell Culture" Freshney, R. I., ed. (1986); "Immobilized Cells
and Enzymes" IRL Press, (1986); "A Practical Guide to Molecular Cloning"
Perbal, B., (1984) and "Methods in Enzymology" Vol. 1-317, Academic Press;
"PCR Protocols: A Guide To Methods And Application", Academic Press, San
Diego, CA (1990); Marshak et al., "Strategies for Protein Purification and
Characterization - A Laboratory Course Manual" CSHL Press (1996); all of
which are incorporated by reference as if fully set forth herein. Other general
references are provided throughout this document. The procedures therein are
believed to be well known in the art and are provided for the convenience of
the reader. All the information contained therein is incorporated herein by
reference.

MATERIALS AND EXPERIMENTAL METHODS
Purification of A. niger β-glucosidase:
A crude preparation of A. niger B1 (CMI CC 324626) β-glucosidase
was obtained from Shaligal Ltd. (Tel-Aviv, Israel). A sample (10 ml) of the
crude enzyme (140 Units/ml) was first diafiltered through a 50 kDa cut-off
Amicon membrane (Amicon Corp., Danvers, MA), with 20 mM citrate buffer
pH=5. The proteins were then separated on an FPLC equipped with a Mono-Q
HR 5/5 column (Amersham Pharmacia Biotech AB, Uppsala, Sweden),
equilibrated with the same buffer. The enzyme was eluted with a linear
gradient of 0 to 350 mM NaCl. Active fractions (see below, enzyme assays)
were monitored and pooled (between 80-110 mM NaCl). The partially
purified enzyme was dialyzed against 20 mM citrate buffer pH=3.5, applied to
a Resource-S column equilibrated with the same buffer, and eluted with a
gradient of 0-1 M NaCl. The purified enzyme (eluted at 155 mM NaCl) was
concentrated by ultrafiltration (50 kDa cut-off membrane, Amicon).
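
For orientation, the position of the active fractions along each salt gradient can be estimated with a small helper. The sketch below assumes an ideal linear gradient with no delay volume, and the function names are chosen for this illustration only; the gradient volumes used in the actual runs are not stated in the text and are therefore not modeled.

```python
def gradient_concentration(fraction, start_mM, end_mM):
    """NaCl concentration (mM) at a given fraction (0..1) of a linear gradient."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return start_mM + fraction * (end_mM - start_mM)


def gradient_fraction(conc_mM, start_mM, end_mM):
    """Fraction (0..1) of a linear gradient at which a concentration is reached."""
    if end_mM == start_mM:
        raise ValueError("gradient must span a non-zero concentration range")
    fraction = (conc_mM - start_mM) / (end_mM - start_mM)
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("concentration lies outside the gradient")
    return fraction


if __name__ == "__main__":
    # Mono-Q step: active fractions pooled between 80 and 110 mM of 0-350 mM.
    print(gradient_fraction(80.0, 0.0, 350.0), gradient_fraction(110.0, 0.0, 350.0))
    # Resource-S step: enzyme eluted at 155 mM of a 0-1000 mM gradient.
    print(gradient_fraction(155.0, 0.0, 1000.0))
```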
Enzyme assays:

β-glucosidase enzyme activity was monitored using a plate assay as
follows. 4-Methylumbelliferyl β-D-glucopyranoside (MUGlc, Sigma Chemical
Inc., St. Louis, Missouri) was dissolved, to a final concentration of 0.5 mM, in
PC buffer (50 mM phosphate, 12 mM citric acid, pH=3.4) at 45 °C. The
solution was mixed with 3 % agar in water, previously boiled and then cooled
to 45 °C. The resulting solution (20 ml) was poured into a petri dish and after
solidification, 10 enzyme samples were spotted. The plate was incubated at
50 °C for one hour, and then illuminated with long-wave UV light. An intense
fluorescence was indicative of β-glucosidase activity.
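
As a convenience, the amount of substrate needed for one plate can be estimated as follows. This sketch assumes that the 0.5 mM figure refers to the final 20 ml poured into the dish, and it uses a molar mass of about 338.31 g/mol for 4-methylumbelliferyl β-D-glucopyranoside (C16H18O8); both points are assumptions of this illustration and are not stated in the text.

```python
MUGLC_MOLAR_MASS = 338.31  # g/mol, C16H18O8 (assumed anhydrous form)


def reagent_mass_mg(conc_mM, volume_ml, molar_mass):
    """Mass (mg) of solute needed for a given concentration and volume.

    mmol/L x ml gives micromoles; micromoles x g/mol gives micrograms.
    """
    micromoles = conc_mM * volume_ml
    micrograms = micromoles * molar_mass
    return micrograms / 1000.0


if __name__ == "__main__":
    mass = reagent_mass_mg(0.5, 20.0, MUGLC_MOLAR_MASS)
    print("MUGlc per 20 ml plate: %.2f mg" % mass)
```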

Detection of β-glucosidase in polyacrylamide gels was carried out by
washing the SDS-polyacrylamide gel with 1:1 isopropanol:PC buffer to
remove SDS and renature the enzyme. The gel was washed once in PC buffer
and incubated in a thin layer of a solution of 0.5 mM MUGlc. After
incubation at 50 °C for one hour, the active protein band was visualized by UV
light.

Quantitative assays were performed using pNPGlc as a substrate
according to Shoseyov (7).

Determination of thermal stability of A. niger β-glucosidase:
Recombinant enzyme (40 µg/ml) was dissolved in 20 mM citrate
phosphate buffer, pH=5. Each tested sample (8 µl) was covered by 15 µl
mineral oil. The activity was determined by the standard pNPGlc assay (7).

Deglycosylation of A. niger β-glucosidase by N-glycosidase-F:
An N-glycosidase-F (Boehringer Mannheim, Mannheim, Germany)
reaction mixture, containing 0.125 µg pure β-glucosidase (previously
denatured by boiling for 3 minutes in 1 % SDS and 5 % β-mercaptoethanol),
0.2 units of the N-glycosidase-F, sodium phosphate buffer (50 mM, pH=7.5),
EDTA (25 mM), 1 % Triton X-100 and 0.02 % sodium azide, in a total volume
of 12.5 µl, was incubated for 4 hours at 37 °C. The reaction was stopped by
addition of PAGE sample application buffer followed by 3 minutes of boiling.

Proteolysis and N-terminal sequences of A. niger B1 β-glucosidase:

Partial enzymatic proteolysis with Staphylococcus aureus V8 protease
was carried out as described by Cleveland (28). Briefly, FPLC-purified
glucosidase (5 µg) was concentrated by acetone precipitation. The protein
was separated on a preparative 10 % SDS-PAGE gel. The gel was stained with
coomassie blue, destained and rinsed with cold water, and the β-glucosidase
protein band was excised. The resulting gel slice was applied to a second
SDS-PAGE gel (15 % acrylamide) and overlaid with Staphylococcus aureus
V8 protease. Digestion was carried out within the stacking gel by turning off
the current for 30 min as the bromophenol blue dye neared the bottom of the
stacking gel, after which the current was restored. The electrophoresed
cleavage products were electroblotted to PVDF membranes. The native protein
was transferred to PVDF in parallel. The N-terminal sequences of the native
protein and two of the numerous cleavage products were analyzed by Edman
degradation using a gas-phase protein sequencer (Applied Biosystems model
475A microsequencer).

Cloning of bgl1 cDNA and genomic gene:

Total RNA isolation: Total RNA was isolated from Aspergillus niger
B1 as follows: A. niger B1 was grown in liquid culture consisting of mineral
medium: (NH4)2SO4·3H2O (0.5 g/l), KH2PO4 (0.2 g/l), MgSO4 (0.2 g/l),
CaCl2·H2O (0.1 g/l), FeSO4·6H2O (0.001 g/l), ZnSO4·7H2O (0.001 g/l), and
2 mM citric acid, at pH=3.5 with 1 % w/v bran as a carbon source. The
medium was autoclaved, cooled and inoculated with A. niger B1 (10^
spores/ml). Baffled flasks were used with constant shaking (200 RPM) at 37 °C.
The appearance of β-glucosidase activity was monitored by placing 5 µl of
growth medium on 1 % agar plates containing 0.5 mM MUGlc, as described
above. Activity was detected following 15 hours of incubation. The mycelium
was harvested following a 24 hour growth period, and the medium was removed by
filtering through GF/A glass microfibre (Whatman Inter. Ltd., Maidstone,
England). The mycelium was then frozen with liquid nitrogen and ground to a
fine powder with mortar and pestle. Total RNA was produced from this
powder by the guanidine thiocyanate (TriReagent™) method (Molecular
Research Center, Inc.).

RNA reverse-transcription reaction: cDNA was obtained by reverse
transcribing total RNA (10 µg) using the Stratagene RT-PCR kit (Stratagene, La
Jolla, CA). The reaction mixture (50 µl) additionally consisted of: Oligo dT18
(1 µg), RNase Block Ribonuclease Inhibitor (20 units), 1x buffer (50 mM
Tris-HCl, pH=8.3, 75 mM KCl, 10 mM DTT, 3 mM MgCl2), dNTPs (500 µM
each) and reverse transcriptase (300 units). Total RNA was initially denatured
at 70 °C, allowed to cool to room temperature (for primer annealing), and
added to the reaction mixture. The reaction mixture was incubated for 1 hour
at 37 °C, heated (95 °C, 5 minutes) and stored at -70 °C until
further use.

DNA amplification: Degenerate primers for the PCR amplification
reaction were synthesized, based on part of the N-terminal amino acid
sequence and an internal sequence, as determined by Edman
degradation following V8 proteolysis (hereinbelow, experimental results).
The partial sequence taken from the β-glucosidase N-terminal amino acid
sequence was Ser-Pro-Pro-Tyr-Tyr-Pro (SEQ ID NO:4), yielding the following
primer: 5'-(C/G)(A/C/G/T)CC(A/C/G/T)CC(A/C/G/T)TA(C/T)TA(C/T)CC-3'
(SEQ ID NO:5). The partial sequence taken from the E2 internal cleavage
product amino acid sequence was Gln-Pro-Ile-Leu-Pro-Ala-Gly-Gly (SEQ ID NO:6),
yielding the following primer: 5'-TCCIGC(T/G/C/A)GG(T/G/C/A)A(G/A)
(T/G/A)AT(T/G/C/A)GG(T/C)TG-3' (SEQ ID NO:7).
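
By way of illustration only, the degeneracy of the two primers described above can be tallied with a short Python sketch. The primer strings are transcribed from the text, the group printed as "(TG/C/A)" is assumed to read (T/G/C/A), inosine (I) is counted as a single non-degenerate position, and the function names are hypothetical.

```python
import re

# Primer strings transcribed from the text; parenthesised groups are
# alternative bases at a single position (assumed reading of the OCR text).
SENSE = "(C/G)(A/C/G/T)CC(A/C/G/T)CC(A/C/G/T)TA(C/T)TA(C/T)CC"
ANTISENSE = "TCCIGC(T/G/C/A)GG(T/G/C/A)A(G/A)(T/G/A)AT(T/G/C/A)GG(T/C)TG"

GROUP = re.compile(r"\(([ACGT/]+)\)")


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides in the degenerate primer pool."""
    total = 1
    for group in GROUP.findall(primer):
        total *= len(group.split("/"))
    return total


def primer_length(primer: str) -> int:
    """Length in bases; each parenthesised group counts as one position."""
    return len(GROUP.sub("N", primer))


for name, primer in (("sense", SENSE), ("antisense", ANTISENSE)):
    print(name, primer_length(primer), "nt,", degeneracy(primer), "variants")
```

The sense primer omits the first base of the Ser codon and the last base of the final Pro codon, which is why six residues give a primer shorter than 18 bases.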

The DNA amplification reaction mixture (25 µl) contained: reverse
transcriptase reaction product (1 µl), 10 x PCR buffer (2.5 µl, Promega Corp.,
Madison, WI), dNTPs (250 µM each), MgCl2 (2.0 mM), degenerate primers
(250 pmol each) and DNA polymerase (3 units, Stratagene, La Jolla, CA), and
was overlaid with mineral oil (25 µl). The reaction was performed in an
automated heating block (Programmable thermal controller - MJ Research, Inc.).
PCR cycling conditions were 30 seconds denaturing at 94 °C, 60 seconds annealing
at 50 °C, and 150 seconds elongation at 72 °C, repeated 36 times. The
resulting amplified product was electrophoresed on a 1.2 % (w/v) agarose/TBE
gel, resulting in a 2.2 kb cDNA gene fragment, which was further isolated
using a Gel Extraction Kit (QIAGEN, Hilden, Germany) and cloned directly into
the single 3'-T PCR insertion site of the pGEM-T cloning vector (Promega Corp.,
Madison, WI).
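
As a non-limiting worked example, the total programmed time of the cycling protocol described above can be computed as follows. The step durations and cycle count are taken from the text; ramp times between temperatures are not stated and are ignored, and the names used are hypothetical.

```python
# Step durations in seconds, as stated for each PCR cycle.
STEPS = {"denature_94C": 30, "anneal_50C": 60, "elongate_72C": 150}
CYCLES = 36


def total_cycling_minutes(steps: dict, cycles: int) -> float:
    """Programmed hold time over all cycles, excluding temperature ramps."""
    return sum(steps.values()) * cycles / 60


print(total_cycling_minutes(STEPS, CYCLES), "minutes")
```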

Probe preparation: The 2.2 kb partial cDNA was digested with PstI to
produce a 1.2 kb fragment DNA probe. A sample (25 ng) of the fragment was
labeled with [32P]dCTP, using the random sequence nonanucleotide rediprime
DNA labeling system (Amersham Pharmacia Biotech AB, Buckinghamshire,
England).

Preparation of genomic DNA plasmid library: An A. niger B1
genomic library was constructed in the pYEUra3 yeast/E. coli shuttle vector
(Clontech Lab. Inc. Palo Alto, CA). A. niger B1 was grown in liquid culture
as described above, and the mycelium was harvested following 48 hours of
growth, frozen in liquid nitrogen and ground. The ground mycelium was used to
produce genomic DNA by the CTAB method of Murray and Thompson (29).
The library was constructed from genomic DNA partially digested with Sau3A
and cloned into the BamHI site of the vector. The pYEUra3 vector was digested
with BamHI and dephosphorylated with CIP to prevent self ligation. The partially
digested genomic DNA was cloned into the shuttle vector with T4 ligase and
used to transform TOP10 E. coli electro-competent cells, which were then
plated on LB-agar containing ampicillin (50 µg/ml). A total of 4 x 10^4
colonies were grown on LB-agar plates, blotted to Hybond-N membranes
(Amersham Pharmacia Biotech AB, Buckinghamshire, England) and screened
using the above described 1.2 kb probe. Positive clones were subcloned in
pUC18 and sequenced (Biological Services, The Weizmann Institute of
Science, Rehovot, Israel).

Expression of bgl1 cDNA in E. coli:

Two specific primers were designed according to the 5' and the 3'
sequences, corresponding to the N-terminal and C-terminal regions of the
mature protein: sense primer: 5'-' (SEQ ID NO:8); antisense primer:
5'-AAAGGATCCTTAGTGAACAGTAGGCAGAGACGC-3' (SEQ ID NO:9).
The isolated cDNA was digested with NcoI and BamHI and cloned into a
pET3d expression vector (Figure 1a, Novagen Inc., Madison, WI). Positive
E. coli BL21(DE3) pLysS colonies, containing the bgl1 cDNA, were
confirmed by restriction enzyme digestion and sequence analysis. Recombinant
BGL1 was expressed according to the manufacturer's protocol.
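
For illustration, the antisense primer (SEQ ID NO:9) can be read on the coding strand by taking its reverse complement. The sketch below uses only the sequence printed above; the helper name and the interpretation of the motifs (a TAA stop codon followed directly by a BamHI recognition site, GGATCC) are offered as a reading of the sequence rather than as a statement from the specification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


ANTISENSE = "AAAGGATCCTTAGTGAACAGTAGGCAGAGACGC"  # SEQ ID NO:9 as printed
BAMHI_SITE = "GGATCC"

coding = reverse_complement(ANTISENSE)
stop_index = coding.find("TAA")
print(coding)
print("stop codon at", stop_index, "; BamHI site at", coding.find(BAMHI_SITE))
```

Read this way, seven sense codons precede the stop codon, and the restriction site lies immediately downstream of it, which is consistent with the NcoI/BamHI cloning step described above.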

Expression of bgl1 cDNA in Saccharomyces cerevisiae and Pichia
pastoris:

The bgl1 cDNA was cloned into the HindIII/BamHI sites of the pYES2
vector (Invitrogen Inc., San Diego, CA), yielding the pYES2-bgl1 plasmid
(Figure 1b), which was used to transform Saccharomyces cerevisiae by the
lithium acetate method (30). BGL1 was expressed by inducing the Gal1
promoter according to the manufacturer's protocol. Saccharomyces cerevisiae
strain INVSc2 (MATa, his3-D200, ura3-167) was used as the host. Pichia
pastoris strain GS115 (his4 mutant) was used as the host for the shuttle and
expression vector plasmid pHIL-S1 (Invitrogen Inc., San Diego, CA). The
bgl1 cDNA was cloned into the EcoRI/BamHI sites of pHIL-S1, yielding the
pHIL-S1-bgl1 expression and secretion vector (Figure 1c). Expression in P.
pastoris was carried out according to the manufacturer's protocol. Screening
of β-glucosidase-expressing clones was facilitated by top-agar, containing 50
mg X-Glc, 30 ml methanol and 1 % agar per liter. Blue color indicated a
colony producing active β-glucosidase.

Western blot analysis:

Antibodies were produced from rabbit serum 36 days following a
second injection of 100 µg purified protein and adjuvant (AniLab Biological
Services, Tal-Sachar, Israel). The high molecular weight ladder was from Sigma
Chemical Inc. St. Louis, Missouri. Western blot conditions were as described
in reference 36.

Determination of the Stereochemical Course of Hydrolysis:
The method was essentially as described by Wong et al. (31). pNPGlc
(10 µmol) was dissolved in 0.5 ml of 25 mM acetate buffer pH=3.5 in D2O in
an NMR tube. β-Glucosidase was lyophilized and redissolved in 100 µl D2O
(35 units/ml). The 1H-NMR spectrum of the substrate was recorded, enzyme
was added (10 µl), and spectra were recorded at specified time intervals on a
Bruker AMX400 at 25 °C.

Inactivation and reactivation studies:

Pure A. niger β-glucosidase enzyme (0.47 mg/ml) was incubated in the
presence of various concentrations of 2-deoxy-2-fluoro-β-glucosyl fluoride
(2FGlcF, 0.5-6 mM) in 30 mM citrate buffer pH=4.8 at 50 °C. Residual
enzyme activity was determined at different time intervals by addition of an
aliquot (10 µl) of the inactivation mixture to a solution containing citrate
buffer (30 mM, pH=4.8), BSA (8 ng) and 2,4-dinitrophenyl β-D-glucopyranoside
(DNPGlc, 0.625 mM, 830 µl). Release of DNP was
determined spectrophotometrically by measuring the absorbance at 400 nm one
minute after the addition of the substrate.

Reactivation rates were determined as follows: pure A. niger
β-glucosidase (0.34 mg/ml) was preincubated with 2FGlcF (5 mM) for 15 min,
after which the excess inactivator was removed by diafiltration using 20-kDa
nominal molecular mass cutoff centrifugal concentrators (Sartorius Inc.,
Goettingen, Germany). Samples of the purified, inactivated enzyme were
incubated in the presence of linamarin (0-16 mM) in citrate buffer (30 mM,
pH=4.8) at 50 °C for 0, 10, 20 and 30 minutes, and the activity of each sample
was determined using p-nitrophenyl β-D-glucopyranoside (pNPGlc) as a substrate.

Expression of bgl1 cDNA in tobacco plants:

Genetic constructs:

bgl1 cDNA was cloned in pETB1 (37). pJD330 and pBINPlus (38)
were used as intermediate and binary vectors, respectively. The Cel1 signal
sequence as well as the 35S plus Ω fragment were retrieved from pB21, a
modified pBluescript SK (39). Nicotiana tabacum cv. Samson was used as a model
plant for gene transformation. Three gene constructs were employed (Figures
11a-c): (i) bgl1 without any signal peptide, which served for cytoplasmic
expression (Figure 11a, plasmid pJDB1); (ii) bgl1 including a Cel1 signal
peptide at the N-terminus for secretion into the apoplast (Figure 11b, plasmid
pJDCB1); and (iii) bgl1 including the Cel1 signal peptide and the KDEL (SEQ
ID NO:24) ER-retaining peptide at the C-terminus for accumulation in the ER
(Figure 11c, plasmid pJDCB1T).

To this end, bgl1 cDNA (2.5 kb) was released from pETB1 (37) with
NcoI and BamHI and inserted into pJD330 between the 35S promoter Ω
fragment and the nos terminator, eliminating the gus gene and resulting in
plasmid pJDB1. The endoplasmic reticulum retaining signal tetrapeptide HDEL
(SEQ ID NO:17) was synthesized and fused with bgl1 at the C-terminus in
pJDB1 by a fidelity PCR reaction with the following pair of primers: Forward
primer (23 mer), starting from nucleotide 1248 of bgl1 cDNA: 5'-(1248)-
CAGTGACCGTGGATGCGACAATG-(1270)-3' (SEQ ID NO:20);
Reverse primer (40 mer), starting at nucleotide 2506 of bgl1 cDNA and also
encoding the HDEL (SEQ ID NO:17) peptide: 5'-(2506)-
AGAGACGGATGACAAGTACTACTTGAAATTGGGCCCAAAA-3' (SEQ
ID NO:21). For pJDCB1T (35S Ω + Cel1 + bgl1 + HDEL), the 35S Ω
fragment of pJDB1 was replaced by a 35S Ω + Cel1 fragment digested from
pB21 with BamHI and XbaI. For pJDCB1 (35S Ω + Cel1 + bgl1), the
fragment containing 35S Ω and Cel1 as well as part of bgl1 was cut from
pJDCB1T with HindIII and NruI and ligated with the vector of pJDB1
digested with the same pair of restriction enzymes. The nucleotide sequences
of all of the genetic constructs were confirmed by DNA sequencing.
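
As a simple consistency check, offered as a sketch only, the stated primer lengths can be compared with the printed sequences and with the stated cDNA coordinates. The sequences are copied from the text above; the variable names are hypothetical.

```python
FORWARD = "CAGTGACCGTGGATGCGACAATG"                   # SEQ ID NO:20 as printed
REVERSE = "AGAGACGGATGACAAGTACTACTTGAAATTGGGCCCAAAA"  # SEQ ID NO:21 as printed

# Inclusive span of the forward primer on the bgl1 cDNA, per the text.
FORWARD_START, FORWARD_END = 1248, 1270
forward_span = FORWARD_END - FORWARD_START + 1

print(len(FORWARD), "nt forward primer covering", forward_span, "positions")
print(len(REVERSE), "nt reverse primer")
```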

Gene cassettes in the intermediate vectors pJDB1, pJDCB1 and
pJDCB1T were further isolated with HindIII and EcoRI and inserted into the
multiple cloning site of the binary vector pBINPlus. Disarmed Agrobacterium
LBA4404 was transformed with pBINPlus containing the bgl1 gene cassettes.

Tobacco plant transformation:
The young leaves of in vitro grown plantlets were excised, cut into
0.5 cm2 pieces and immersed for 5 minutes in an overnight grown culture
of Agrobacterium. After blotting with sterile Whatman filter paper, the
infected leaves were co-cultured for 2 days with Agrobacterium on MS
medium plus 2.0 mg/L of zeatin and 0.1 mg/L of IAA as well as 0.35 % (w/v)
phytagel, and then transferred to the same medium but with 300 mg/L
kanamycin and 300 mg/L carbenicillin. Regenerates were then transferred to
rooting medium, containing only MS salts, vitamins and the same antibiotics.
Rooted plants were transferred to the greenhouse after PCR screening.

Screening for transgenic plants:
DNA and protein of plants were extracted according to Nagy et al. (40).
PCR verification of gene insertion into the plant genome was done with the
following pair of primers, which cover the DNA fragment from position 1248
to the end of bgl1: 5'-CAGTGACCGTGGATGCGACAATG-3' (SEQ ID
NO:22) and 5'-AAAGGATCCTTAGTGAACAGTAGGCAGAGACGC-3'
(SEQ ID NO:23).

Identifying transgenic plants expressing BGL1 protein and activity:
Western blot (40) and SDS-PAGE activity gel staining (37) were
employed to screen successful transgenic lines, using the purified A. niger
BGL1 protein as a positive control and a non-transgenic plant as a negative
control.

SPMI-GC/MS analysis:

The effect of bgl1 on flavor compound evolution and composition was
studied. Fresh leaves of transgenic plants and of wild type control plants were
excised and ground in liquid nitrogen. Ice-cold extraction buffer, containing
10 mM EDTA and 4 mM DTT in 50 mM phosphate buffer, pH 4.3, was added in
a ratio of 1:3 w/w. The mixture was then shaken for 0.5 hours. A 0.75 ml sample
of supernatant from each of the centrifuged mixtures was taken into a glass vial.
All manipulations were at 4 °C. After 9 hours of incubation at 37 °C, the
volatiles in the vial were analyzed according to Clark et al. (41) using a Saturn
Varian 3800 SPMI-GC-MS apparatus, equipped with a DB-5 capillary column.
The temperature of splitless injections was 250 °C and the transfer line was
maintained at 280 °C. Helium was used as a carrier gas. The oven was
programmed as follows: 1 minute at 40 °C, followed by gradual heating to
250 °C at a rate of 5 °C/minute.
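
As a worked example, the length of the oven program described above follows directly from the stated hold time and ramp rate. The sketch assumes that no final hold is applied at 250 °C, since none is stated; the function name is hypothetical.

```python
def gc_run_minutes(start_c: float, end_c: float,
                   ramp_c_per_min: float, initial_hold_min: float) -> float:
    """Initial hold plus a single linear ramp; no final hold is assumed."""
    return initial_hold_min + (end_c - start_c) / ramp_c_per_min


RUN_MINUTES = gc_run_minutes(start_c=40, end_c=250,
                             ramp_c_per_min=5, initial_hold_min=1)
print(RUN_MINUTES, "minutes")
```

A run of this length comfortably covers the latest retention time reported in Table 4 (19.327 minutes).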

EXPERIMENTAL RESULTS

Purification of wild type A. niger β-glucosidase:

A. niger β-glucosidase enzyme preparation was purified by Mono-Q
FPLC. Active protein samples eluted from the Mono-Q column were
separated on a 10 % SDS-PAGE gel, stained with coomassie blue, and
incubated in the presence of MUGlc to demonstrate activity of the enzyme. At
this stage of purification, a discrete band, having an apparent molecular mass
of approximately 160 kDa and β-glucosidase activity, could be detected (Figure
2b, lanes 1-5: 1 - electroeluted band of BGL1 from preparative SDS-PAGE gel
slabs; 2-5 - acetone precipitates from Mono-Q separation of BGL1). However,
the apparent mass of the denatured enzyme (boiled for 10 min in the presence
of β-mercaptoethanol) was shown to be 120 kDa on 10 % SDS-PAGE (Figure
2a). The enzyme, designated BGL1, was further purified to homogeneity
on a Resource-S column (Figure 3). Deglycosylation of A. niger β-glucosidase
was performed by N-glycosidase-F. As demonstrated in Figure 4, SDS-PAGE
analysis indicated that approximately 20 kDa of the A. niger β-glucosidase
mass can be attributed to N-linked carbohydrates.

Proteolysis and N-terminal sequences of BGL1:

Partial enzymatic proteolysis with Staphylococcus aureus V8 protease
of purified BGL1 was conducted. The undigested protein and cleavage
products were separated by SDS-PAGE, followed by electroblotting onto
PVDF membranes and determination of the N-terminal sequence of the native
protein and two of the cleavage products. Amino acid sequences obtained
were as follows:

N-terminal native protein: Asp-Glu-Leu-Ala-Tyr-Ser-Pro-Pro-Tyr-Tyr-
Pro-Ser-Pro-Trp-Ala-Asn-Gly-Gln-Gly-Asp (SEQ ID NO:10). Residues 6-11
(Ser-Pro-Pro-Tyr-Tyr-Pro) represent SEQ ID NO:4.

Internal cleavage product - E1 polypeptide: Val-Leu-Lys-His-Lys-Asn-
Gly-Val-Phe-Thr-Ala-Thr-Asp-Asn-Trp-Ala-Ile-Asp-Gln-Ile-Glu-Ala-Leu-
Ala-Lys (SEQ ID NO:11).

Internal cleavage product - E2 polypeptide: Gly-Ala-Thr-Asp-Gly-Ser-
Ala-Gln-Pro-Ile-Leu-Pro-Ala-Gly-Gly-Gly-Pro-Gly-Gly-Asn-Pro (SEQ ID
NO:12). Residues 8-15 (Gln-Pro-Ile-Leu-Pro-Ala-Gly-Gly) represent SEQ ID NO:6.

FastA analysis (32) indicated that the N-terminal sequence, as well as
the internal sequences, have sequence similarity with sequences of
β-glucosidase from the yeast Saccharomycopsis fibuligera, which belongs to
Family 3 of the glycosyl hydrolases.

Isolation and characterization of bgl1 cDNA and genomic DNA:
In order to clone the A. niger β-glucosidase gene, degenerate primers
were designed according to the sequences of digest fragments of the
polypeptide. These oligonucleotides were used to amplify a cDNA fragment
of the β-glucosidase gene by RT-PCR. A 1.2 kb probe was excised from the
resultant 2.2 kb amplification product and was used to screen a genomic
library, constructed in the pYEUra3 yeast/E. coli shuttle vector. Positive clones
were successfully subcloned and sequenced, resulting in the full length bgl1
genomic sequence (SEQ ID NO:3, Figure 5a). Amplification primers were
then generated, according to the genomic DNA sequence, corresponding to the
N- and C-termini of the mature protein. RT-PCR was thereafter used for
amplifying the full length β-glucosidase cDNA sequence (SEQ ID NO:1,
Figure 5a, GenBank Accession No. AJ132386). The cDNA sequence
perfectly matched the DNA sequence of the combined exons. The open
reading frame was found to encode a polypeptide with a predicted molecular
weight of 92 kDa. The gene includes 7 exons interrupted by 6 introns (Figure
5b). Analysis of the DNA sequence upstream of the sequence encoding the
mature protein revealed a putative leader sequence, interrupted by an 82 bp
intron.

Production of rBGL1 in E. coli:

Recombinant BGL1 was overexpressed in E. coli. No apparent
β-glucosidase activity could be detected in the E. coli extracts; however,
SDS-PAGE analysis revealed a relatively intense protein band expressed at the
expected molecular weight. Western blot analysis using rabbit polyclonal
anti-native BGL1 antibodies (AniLab Biological Services, Tal-Sachar, Israel)
positively identified the 90 kDa protein band (not shown). Further analysis
revealed that the protein accumulated in inclusion bodies. Several
refolding experiments were conducted; however, these efforts to produce
active protein from E. coli failed (not shown).

Expression of recombinant BGL1 in S. cerevisiae and P. pastoris:
Recombinant BGL1 was successfully expressed both in S. cerevisiae
and P. pastoris. In S. cerevisiae a relatively low level of expression was
found. The recombinant protein was detected by Western blot analysis
(Figure 6a). The total protein extract of S. cerevisiae expressing bgl1 cDNA
had a β-glucosidase activity of 1.9 units/mg protein. No β-glucosidase activity
was detected in control S. cerevisiae, transformed with vector only, under the
same assay conditions. However, no protein band corresponding to
recombinant BGL1 could be detected by coomassie blue staining. P. pastoris
transformed with bgl1 secreted relatively high levels of recombinant BGL1 to
the medium (about 0.5 g/l), appearing as an almost pure protein in the culture
supernatant (Figure 6b). This recombinant enzyme was highly active (124
units/mg protein) and, without further purification, yielded a specific activity
similar to that of the pure native enzyme.
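
By way of a rough worked example, the figures reported above can be combined into an approximate volumetric activity for the P. pastoris culture supernatant. The sketch assumes that essentially all of the secreted protein (about 0.5 g/l) is BGL1 with the stated specific activity, which the text suggests but does not state as a measured value; the names are hypothetical.

```python
def volumetric_activity(protein_g_per_l: float, units_per_mg: float) -> float:
    """Approximate units per liter, assuming all protein has the given activity."""
    return protein_g_per_l * 1000 * units_per_mg


PICHIA_UNITS_PER_L = volumetric_activity(protein_g_per_l=0.5, units_per_mg=124)

# Ratio of the reported specific activities. Note that the S. cerevisiae value
# refers to total protein extract, so this is not a like-for-like comparison.
SPECIFIC_ACTIVITY_RATIO = 124 / 1.9

print(PICHIA_UNITS_PER_L, "units/l;", round(SPECIFIC_ACTIVITY_RATIO, 1), "fold")
```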

1H-NMR determination of stereochemical outcome:
1H-NMR spectra of a reaction mixture containing pNPGlc and BGL1
revealed that the beta anomer of glucose was formed first (H-1 = 4.95 ppm),
with delayed appearance of the alpha anomer (H-1 = 5.59 ppm) as a consequence
of mutarotation (Figure 7). BGL1 is therefore a retaining glycosidase,
as has been observed for other family members (33, 34).

Inactivation and reactivation of A. niger β-glucosidase:
Enzyme was incubated in the presence of various concentrations of
2FGlcF and residual enzyme activity was monitored at different time intervals.
Enzyme activity decreased in a time-dependent manner, according to
pseudo-first order kinetics, allowing the determination of pseudo-first order
rate constants for inactivation at each inactivator concentration (0, 0.5, 1, 2,
4, and 6 mM, Figure 8) and yielding the inactivation parameters
ki = 4.5 min^-1 and Ki = 35.4 mM.

Rates of reactivation of 2-deoxy-2-fluoroglucosyl-BGL1 were
determined in the presence of different concentrations of linamarin by
monitoring activity regain after 0, 10, 20 and 30 min (Figure 9). The regain of
activity followed a first order process at each linamarin concentration.

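
The inactivation kinetics described above can be sketched numerically as follows. This is an illustration under stated assumptions rather than the analysis actually used: the two reported constants are read as ki = 4.5 min^-1 and Ki = 35.4 mM, and the saturation model k_obs = ki[I]/(Ki + [I]) commonly applied to inactivators of this kind is assumed. The function names are hypothetical.

```python
import math

K_INACT = 4.5  # min^-1, reported inactivation rate constant (assumed reading)
K_I = 35.4     # mM, reported inactivator constant (assumed reading)


def k_obs(inactivator_mM: float) -> float:
    """Pseudo-first-order rate constant (min^-1) under the assumed model."""
    return K_INACT * inactivator_mM / (K_I + inactivator_mM)


def residual_activity(inactivator_mM: float, minutes: float) -> float:
    """Fraction of the initial activity remaining after the given exposure."""
    return math.exp(-k_obs(inactivator_mM) * minutes)


for conc in (0, 0.5, 1, 2, 4, 6):
    print(f"{conc:>4} mM  k_obs = {k_obs(conc):.3f} min^-1  "
          f"remaining after 1 min = {residual_activity(conc, 1):.2f}")
```

Because all tested concentrations are well below Ki, k_obs rises almost linearly with inactivator concentration over the 0.5-6 mM range.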
Thermal stability of A. niger β-glucosidase:

Thermal stability of the recombinant enzyme was evaluated at different
temperatures, presented as percent enzymatic activity relative to an enzyme
solution kept at 4 °C. Results obtained are summarized in Table 2 and
illustrated in Figure 10. The purified enzyme exhibits high thermal stability, as
the majority (above 50 %) of the activity is maintained at temperatures ranging
from 4 to 60 °C.

TABLE 2

Temp (°C)    % activity
4            100
50           91.5
55           83.5
60           68
65           17.8
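
As an illustrative calculation only, the temperature at which half of the activity remains can be estimated from the Table 2 values by linear interpolation between neighboring points. The interpolation is an assumption made for this sketch (the true curve between 60 and 65 °C is not reported), and the function name is hypothetical.

```python
# (temperature in °C, percent activity relative to the 4 °C sample), from Table 2.
TABLE_2 = [(4, 100.0), (50, 91.5), (55, 83.5), (60, 68.0), (65, 17.8)]


def temperature_at(activity: float, table: list) -> float:
    """Linearly interpolate the temperature at which `activity` percent remains."""
    for (t_low, a_high), (t_high, a_low) in zip(table, table[1:]):
        if a_low <= activity <= a_high:
            fraction = (a_high - activity) / (a_high - a_low)
            return t_low + fraction * (t_high - t_low)
    raise ValueError("activity outside the tabulated range")


T50 = temperature_at(50.0, TABLE_2)
print(f"estimated temperature for 50 % activity: {T50:.1f} °C")
```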



Expression of BGL1 in tobacco plants:

Agrobacterium mediated leaf disc transformation resulted in transgenic
tobacco plants, as was shown by PCR (Figure 12) for the presence of the
transgene, Western blotting (Figures 13a-b) for presence of the protein and
activity assays (Figures 14 and 15) for presence of protein activity. Table 3
below summarizes the results.



TABLE 3

Gene construct            BGL1    Cel1 + BGL1 + HDEL    Cel1 + BGL1
Number of regenerates     33      14                    27
PCR positive              29      9                     23
Western blot positive     4       9                     18
Activity gel positive     0       9                     18
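
For illustration, the Table 3 counts can be expressed as percentages of the PCR positive regenerates for each construct. The counts are taken from the table; the dictionary keys are shorthand labels introduced for this sketch, not names used in the specification.

```python
# Counts from Table 3; keys are hypothetical shorthand for the three constructs.
TABLE_3 = {
    "no_signal_peptide": {"regenerates": 33, "pcr": 29, "western": 4, "activity": 0},
    "signal_plus_HDEL": {"regenerates": 14, "pcr": 9, "western": 9, "activity": 9},
    "signal_only": {"regenerates": 27, "pcr": 23, "western": 18, "activity": 18},
}


def percent_of_pcr_positive(construct: str, assay: str) -> float:
    """Share of PCR positive plants that were also positive in `assay`."""
    row = TABLE_3[construct]
    return 100 * row[assay] / row["pcr"]


for construct in TABLE_3:
    western = percent_of_pcr_positive(construct, "western")
    activity = percent_of_pcr_positive(construct, "activity")
    print(f"{construct}: Western {western:.1f} %, activity {activity:.1f} %")
```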



Of the 29 PCR positive regenerates transformed with the cDNA encoding
BGL1 without a signal peptide, the BGL1 protein was detectable via
Western blotting in only 4; however, no BGL1 activity was measurable in
any of them. This BGL1 was smaller in molecular weight than
wild type A. niger β-glucosidase and than processed recombinant BGL1
containing a signal peptide. Its apparent size of about 95 kDa is very close to
92 kDa, which is the calculated molecular weight of the un-glycosylated A.
niger β-glucosidase. This result is consistent with the fact that a protein with
no signal peptide is expected to be released from the ribosomes and remain in
the cytoplasm (42) un-glycosylated, as protein glycosylation is conducted in the
lumen of the endoplasmic reticulum (43).

Of the 9 PCR positive regenerates transformed with a cDNA encoding
BGL1, the Cel1 signal peptide and, in addition, the HDEL ER
retaining peptide, all plants expressed detectable amounts of BGL1 protein and
activity.

Of the 23 PCR positive regenerates transformed with a cDNA which
encodes the BGL1 protein and the Cel1 signal peptide but not the HDEL ER
retaining peptide, 18 plants expressed detectable amounts of BGL1 protein and
activity.

The effect of BGL1 on flavor compound evolution and composition in
transgenic tobacco plants:

Extracts of transgenic plants (CB14 and CBT21, containing similar
BGL1 activity, see Figure 15) were incubated for 9 hours at 37 °C, and flavor
compounds were analyzed by SPMI-GC/MS. The results, which are
summarized in Table 4 below, show that with the exception of oleyl alcohol,
the concentration of different flavor compounds is increased in transgenic
plants expressing active BGL1 compared with the control. Furthermore, it
seems that compartmentalization of BGL1 in the ER (or for that matter, any
other subcellular organelle), rather than its secretion to the apoplast, results in
higher release of flavor compounds. This is likely to result from the
localization of many flavor compounds in the apoplast: secretion of BGL1
to the apoplast causes in vivo release of flavor compounds, while
compartmentalization of BGL1 in the ER results in release of flavor
compounds only in the event of cell disruption and decompartmentalization.

TABLE 4

Retention Time   Scan   Name                                      CB14    CBT21
(minutes)
3.917            419    Hexanal                                   - (a)   -
4.749            508    3-methyl-pentanoic acid                           -
4.863            520    2-Hexenal                                 -
5.167            552    ?                                         -       +
6.564            702    1-Heptanol                                -       -
7.1              752    ?                                         +
8.085            865    2-ethyl-1-hexanol                         -       +
8.132            870    Limonene                                  ++      +
8.194            877    2-methyl-phenol                           -       +
10.653           1139   Menthol                                   +       +
11.757           1258   Nerol                                     -       +
12.039           1288   6-Quinolinol                              -
12.1             1294   2-butyl-1-octanol                         -       +
13.0             1458   ?                                         -       +
13.7             1466   ?                                         -       +
14.091           1507   Vitispirane                                       +
14.094           1516   4-[2,6,6-trimethyl-1-cyclohexen-1-yl]-    +       ++
                        3-Buten-1-one
15.985           1710   ?
19.327           2069   Oleyl alcohol                             c

CB14 - transgenic plant containing Cel1 signal peptide + BGL1; CBT21 - transgenic plant containing
Cel1 signal peptide + BGL1 + HDEL ER retaining peptide. a - "-" means no significant difference in
concentration compared with the wild type; b - "+" means significant increase compared with the wild
type; c - marks a significant decrease compared with the wild type; d - "++" means significant
increase compared with a respective mark "+". ? - unknown compound.

Although the invention has been described in conjunction with specific 
embodiments thereof, it is evident that many alternatives, modifications and
variations will be apparent to those skilled in the art. Accordingly, it is
intended to embrace all such alternatives, modifications and variations that fall
within the spirit and broad scope of the appended claims. All publications,
patents, patent applications and sequences identified by GenBank accession
numbers mentioned in this specification are herein incorporated in their
entirety by reference into the specification, to the same extent as if each
individual publication, patent, patent application or sequence was specifically
and individually indicated to be incorporated herein by reference. In addition, 
citation or identification of any reference in this application shall not be 
construed as an admission that such reference is available as prior art to the 
present invention.

1. An isolated nucleic acid comprising a genomic, complementary
or composite polynucleotide being derived from Aspergillus niger, encoding a
polypeptide having a β-glucosidase catalytic activity.

2. The isolated nucleic acid of claim 1, wherein said polynucleotide
is as set forth in SEQ ID NOs: 1, 3 or a portion thereof.

3. A nucleic acid construct comprising the isolated nucleic acid of 
claim 1. 

4. The nucleic acid construct of claim 3, further comprising at least 
one cis acting control element for regulating expression of said polynucleotide. 

5. A host cell comprising the nucleic acid construct of claim 3. 

6. The host cell of claim 5, wherein the host cell is selected from 
the group consisting of a prokaryotic cell and a eukaryotic cell. 

7. The host cell of claim 6, wherein said prokaryotic cell is E. coli. 

8. The host cell of claim 6, wherein said eukaryotic cell is selected 
from the group consisting of a yeast cell, a fungous cell, a plant cell and an 
animal cell. 

9. An organism comprising the nucleic acid construct of claim 3. 

10. The organism of claim 9, wherein the organism is a plant.

11. A recombinant protein comprising an Aspergillus niger derived
polypeptide having a β-glucosidase catalytic activity.

12. The recombinant protein of claim 11, wherein said polypeptide is
as set forth in SEQ ID NO: 2 or a portion thereof having said β-glucosidase
catalytic activity.

13. A method of producing recombinant β-glucosidase, the method
comprising the step of introducing, in an expressible form, a nucleic acid
construct into a host cell, said nucleic acid construct including a genomic,
complementary or composite polynucleotide being derived from Aspergillus
niger, encoding a polypeptide having a β-glucosidase catalytic activity.

14. The method of claim 13, further comprising the step of
extracting said polypeptide having said β-glucosidase catalytic activity.

15. The method of claim 13, wherein said host cell is selected from 
the group consisting of a prokaryotic cell and a eukaryotic cell. 

16. The method of claim 15, wherein said prokaryotic cell is E. coli.

17. The method of claim 15, wherein said eukaryotic cell is selected 
from the group consisting of a yeast cell, a fungous cell, a plant cell and an 
animal cell. 

18. The method of claim 13, wherein said polypeptide is as set forth
in SEQ ID NO: 2 or a portion thereof having said β-glucosidase catalytic
activity.

19. The method of claim [...], wherein said polynucleotide is as set
forth in SEQ ID NOs: 1, 3 or a portion thereof.

20. A method of producing a recombinant β-glucosidase
overexpressing [...], the method [...] a nucleic acid [...]
a genomic, complementary or composite polynucleotide
being derived from Aspergillus niger, [...] a
β-glucosidase catalytic activity.

21. The method of claim 20, wherein said host cell is selected from
the group consisting of a prokaryotic cell and a eukaryotic cell.

22. The method of claim 21, wherein said prokaryotic cell is E. coli.

23. The method of claim [...], wherein said eukaryotic cell is selected
from the group consisting of a yeast cell, a fungous cell, a plant cell and an
animal cell.

24. The method of claim [...], wherein said polypeptide is as set forth
in SEQ ID NO: 2 or a portion thereof having said β-glucosidase catalytic
activity.

25. The method of claim 20, wherein said polynucleotide is as set
forth in SEQ ID NOs: 1, 3 or a portion thereof.

26. A method of increasing a level of at least one fermentation
substance in a fermentation product, the method comprising the step of
fermenting a glucose containing fermentation starting material by a yeast cell
overexpressing a nucleic acid construct including a genomic, complementary
or composite polynucleotide being derived from Aspergillus niger, encoding a
polypeptide having a β-glucosidase catalytic activity, thereby increasing the
level of the at least one fermentation substance in the fermentation product.

27. The method of claim 26, wherein said polypeptide is as set forth 
in SEQ ID NO: 2 or a portion thereof having said β-glucosidase catalytic
activity. 

28. The method of claim 26, wherein said polynucleotide is as set 
forth in SEQ ID NOs: 1, 3 or a portion thereof. 

29. A method of increasing a level of at least one fermentation 
substance in a fermentation product, the method comprising the step of 
fermenting a plant derived glucose containing fermentation starting material by 
a yeast cell, said plant overexpressing a nucleic acid construct including a 
genomic, complementary or composite polynucleotide being derived from 
Aspergillus niger, encoding a polypeptide having a β-glucosidase catalytic
activity, thereby increasing the level of the at least one fermentation substance 
in the fermentation product. 

30. The method of claim 29, wherein said polypeptide is as set forth in
SEQ ID NO: 2 or a portion thereof having said β-glucosidase catalytic activity.

31. The method of claim 29, wherein said polynucleotide is as set 
forth in SEQ ID NOs: 1, 3 or a portion thereof. 

32. A method of increasing a level of at least one aroma substance in 
a plant derived product, the method comprising the step of incubating a 
glucose containing plant starting material with a yeast cell overexpressing a 
nucleic acid construct including a genomic, complementary or composite
polynucleotide being derived from Aspergillus niger, encoding a polypeptide
having a β-glucosidase catalytic activity, thereby increasing the level of the at
least one aroma substance in the plant derived product. 

33. The method of claim 32, wherein said plant derived product is a 
fermentation product. 

34. The method of claim 32, wherein said polypeptide is as set forth 
in SEQ ID NO: 2 or a portion thereof having said β-glucosidase catalytic
activity. 

35. The method of claim 32, wherein said polynucleotide is as set 
forth in SEQ ID NOs: 1, 3 or a portion thereof. 



36. A method of increasing a level of at least one aroma substance in 
a plant derived product, the method comprising the step of incubating a 
glucose containing plant starting material with a yeast cell, said plant 
overexpressing a nucleic acid construct including a genomic, complementary 
or composite polynucleotide being derived from Aspergillus niger, encoding a 
polypeptide having a β-glucosidase catalytic activity, thereby increasing the
level of the at least one aroma substance in the plant derived product. 

37. The method of claim 36, wherein said plant derived product is a 
fermentation product. 

38. The method of claim 36, wherein said polypeptide is as set forth 
in SEQ ID NO: 2 or a portion thereof having said β-glucosidase catalytic
activity.

39. The method of claim 36, wherein said polynucleotide is as set
forth in SEQ ID NOs: 1, 3 or a portion thereof.

40. A method of increasing a level of free glucose in a glucose 
containing fermentation starting material, the method comprising the step of 
fermenting the glucose containing fermentation starting material by a cell 
overexpressing a nucleic acid construct including a genomic, complementary 
or composite polynucleotide being derived from Aspergillus niger, encoding a 
polypeptide having a β-glucosidase catalytic activity, thereby increasing the
level of the free glucose in the glucose containing fermentation starting 
material. 

41. The method of claim 40, wherein said cell is selected from the 
group consisting of a prokaryotic cell and a eukaryotic cell. 

42. The method of claim 41, wherein said prokaryotic cell is E. coli. 

43. The method of claim 41, wherein said eukaryotic cell is selected 
from the group consisting of a yeast cell, a fungous cell, a plant cell and an 
animal cell. 

44. The method of claim 40, wherein said polypeptide is as set forth 
in SEQ ID NO: 2 or a portion thereof having said β-glucosidase catalytic
activity. 

45. The method of claim 40, wherein said polynucleotide is as set 
forth in SEQ ID NOs: 1, 3 or a portion thereof. 

46. A method of increasing a level of free glucose in a plant derived 
glucose containing fermentation starting material, the method comprising the
step of fermenting the plant derived glucose containing fermentation starting
material by a cell, said plant overexpressing a nucleic acid construct including
a genomic, complementary or composite polynucleotide being derived from
Aspergillus niger, encoding a polypeptide having a β-glucosidase catalytic
activity, thereby increasing the level of the free glucose in the plant.

47. The method of claim 46, wherein said polypeptide is as set forth 
in SEQ ID NO: 2 or a portion thereof having said β-glucosidase catalytic
activity. 

48. The method of claim 46, wherein said polynucleotide is as set 
forth in SEQ ID NOs: 1, 3 or a portion thereof. 

49. A method of increasing a level of free glucose in a plant, the 
method comprising the step of overexpressing in the plant a nucleic acid 
construct including a genomic, complementary or composite polynucleotide 
being derived from Aspergillus niger, encoding a polypeptide having a
β-glucosidase catalytic activity, thereby increasing the level of the free glucose in
the plant. 

50. The method of claim 49, wherein said polypeptide is as set forth 
in SEQ ID NO: 2 or a portion thereof having said β-glucosidase catalytic
activity. 

51. The method of claim 49, wherein said polynucleotide is as set 
forth in SEQ ID NOs: 1, 3 or a portion thereof.

52. A method of producing an alcohol, the method comprising the 
step of fermenting a glucose containing fermentation starting material by a cell 
overexpressing a nucleic acid construct including a genomic, complementary
or composite polynucleotide being derived from Aspergillus niger, encoding a
polypeptide having a β-glucosidase catalytic activity, and extracting the
alcohol therefrom. 



53. The method of claim 52, wherein said cell is selected from the 
group consisting of a prokaryotic cell and a eukaryotic cell. 

54. The method of claim 53, wherein said prokaryotic cell is E. coli. 

55. The method of claim 53, wherein said eukaryotic cell is selected 
from the group consisting of a yeast cell, a fungous cell, a plant cell and an 
animal cell. 

56. The method of claim 52, wherein said polypeptide is as set forth 
in SEQ ID NO: 2 or a portion thereof having said β-glucosidase catalytic
activity. 

57. The method of claim 52, wherein said polynucleotide is as set 
forth in SEQ ID NOs: 1, 3 or a portion thereof. 

58. A method of producing an alcohol, the method comprising the 
step of fermenting a plant derived glucose containing fermentation starting 
material by a cell, said plant overexpressing a nucleic acid construct including 
a genomic, complementary or composite polynucleotide being derived from 
Aspergillus niger, encoding a polypeptide having a β-glucosidase catalytic
activity, and extracting the alcohol therefrom. 

59. The method of claim 58, wherein said polypeptide is as set forth 
in SEQ ID NO: 2 or a portion thereof having said β-glucosidase catalytic
activity.

60. The method of claim 58, wherein said polynucleotide is as set
forth in SEQ ID NOs: 1, 3 or a portion thereof.

61. A method of producing an aroma spreading plant, the method 
comprising the step of overexpressing in the plant a nucleic acid construct 
including a genomic, complementary or composite polynucleotide being 
derived from Aspergillus niger, encoding a polypeptide having a β-glucosidase
catalytic activity, thereby increasing aroma spread from the plant. 

62. The method of claim 61, wherein overexpressing said nucleic 
acid construct is performed in a tissue specific manner. 

63. The method of claim 61, wherein overexpressing said nucleic 
acid construct is limited to at least one tissue selected from the group 
consisting of flower, fruit, seed, root, stem, pollen and leaves. 

64. The method of claim 61, wherein said polypeptide is as set forth 
in SEQ ID NO: 2 or a portion thereof having said β-glucosidase catalytic
activity. 

65. The method of claim 61, wherein said polynucleotide is as set 
forth in SEQ ID NOs: 1, 3 or a portion thereof. 

66. An isolated nucleic acid comprising a genomic, complementary 
or composite polynucleotide encoding a polypeptide having a β-glucosidase
catalytic activity and further encoding a signal peptide in frame with said 
polypeptide. 

67. The isolated nucleic acid of claim 66, further encoding an 
endoplasmic retaining sequence in frame with said polypeptide.

68. A nucleic acid construct comprising the isolated nucleic acid of
claim 66.



69. The nucleic acid construct of claim 68, further comprising at 
least one cis acting control element for regulating expression of said 
polynucleotide. 

70. A host cell comprising the nucleic acid construct of claim 68. 

71. The host cell of claim 70, wherein the host cell is of a plant.

72. The host cell of claim 70, wherein the host cell is of a yeast. 

73. The host cell of claim 70, wherein the host cell is of an animal. 



74. An organism comprising the isolated nucleic acid of claim 66 as 
an inter or intra chromosomal genetic element. 

75. The organism of claim 74, wherein the organism is a plant. 



76. A recombinant protein comprising a polypeptide having a
β-glucosidase catalytic activity and a signal peptide fused thereto.

77. The recombinant protein of claim 76, further comprising an 
endoplasmic reticulum retaining peptide fused thereto. 



78. A method of producing recombinant β-glucosidase, the method
comprising the step of introducing, in an expressible form, a nucleic acid 
construct into a host cell, said nucleic acid construct including a genomic, 
complementary or composite polynucleotide encoding a polypeptide having a
β-glucosidase catalytic activity and a signal peptide being in frame with said
polypeptide.

79. The method of claim 78, wherein said polynucleotide further 
encodes an endoplasmic reticulum retaining peptide being in frame with said 
polypeptide. 

80. The method of claim 78, further comprising the step of 
extracting said polypeptide having said β-glucosidase catalytic activity.

81. The method of claim 78, wherein said host cell is a plant cell. 

82. The method of claim 78, wherein said host cell is an animal cell. 

83. The method of claim 78, wherein said host cell is a yeast cell. 

84. A method of producing a recombinant β-glucosidase
overexpressing cell, the method comprising the step of introducing, in an 
overexpressible form, a nucleic acid construct into a host cell, said nucleic acid 
construct including a genomic, complementary or composite polynucleotide 
encoding a polypeptide having a β-glucosidase catalytic activity and further
encoding a signal peptide being in frame with said polypeptide. 

85. The method of claim 84, wherein said polynucleotide further 
encodes an endoplasmic reticulum retaining peptide being in frame with said 
polypeptide. 

86. The method of claim 84, wherein said host cell is a plant cell.

87. The method of claim 84, wherein said host cell is a yeast cell.

88. The method of claim 84, wherein said host cell is an animal cell.

89. A method of increasing a level of at least one fermentation
substance in a fermentation product, the method comprising the step of
fermenting a plant derived glucose containing fermentation starting material by
a yeast cell, said plant overexpressing a nucleic acid construct including a
genomic, complementary or composite polynucleotide encoding a polypeptide
having a β-glucosidase catalytic activity and further encoding a signal peptide
being in frame with said polypeptide, thereby increasing the level of the at
least one fermentation substance in the fermentation product.

90. The method of claim 89, wherein said polynucleotide further
encodes an endoplasmic reticulum retaining peptide being in frame with said
polypeptide.

91. A method of increasing a level of at least one aroma substance in
a plant derived product, the method comprising the step of incubating a
glucose containing plant starting material with a yeast cell, said plant
overexpressing a nucleic acid construct including a genomic, complementary
or composite polynucleotide encoding a polypeptide having a β-glucosidase
catalytic activity and further encoding a signal peptide being in frame with said
polypeptide, thereby increasing the level of the at least one aroma substance in
the plant derived product.

92. The method of claim 91, wherein said polynucleotide further
encodes an endoplasmic reticulum retaining peptide being in frame with said
polypeptide.

93. The method of claim 91, wherein said plant derived product is a
fermentation product.

94. A method of increasing a level of free glucose in a glucose 
containing fermentation starting material, the method comprising the step of 
fermenting the glucose containing fermentation starting material by a cell 
overexpressing a nucleic acid construct including a genomic, complementary 
or composite polynucleotide encoding a polypeptide having a β-glucosidase
catalytic activity and further encoding a signal peptide being in frame with said 
polypeptide, thereby increasing the level of the free glucose in the glucose 
containing fermentation starting material. 

95. The method of claim 94, wherein said polynucleotide further 
encodes an endoplasmic reticulum retaining peptide being in frame with said 
polypeptide. 

96. A method of increasing a level of free glucose in a plant derived 
glucose containing fermentation starting material, the method comprising the 
step of fermenting the plant derived glucose containing fermentation starting 
material by a cell, said plant overexpressing a nucleic acid construct including 
a genomic, complementary or composite polynucleotide encoding a 
polypeptide having a β-glucosidase catalytic activity and further encoding a
signal peptide being in frame with said polypeptide, thereby increasing the 
level of the free glucose in the plant. 

97. The method of claim 96, wherein said polynucleotide further 
encodes an endoplasmic reticulum retaining peptide being in frame with said 
polypeptide.

98. A method of increasing a level of free glucose in a plant, the
method comprising the step of overexpressing in the plant a nucleic acid
construct including a genomic, complementary or composite polynucleotide
encoding a polypeptide having a β-glucosidase catalytic activity and further
encoding a signal peptide being in frame with said polypeptide, thereby 
increasing the level of the free glucose in the plant. 

99. The method of claim 98, wherein said polynucleotide further 
encodes an endoplasmic reticulum retaining peptide being in frame with said 
polypeptide. 

100. A method of producing an alcohol, the method comprising the 
step of fermenting a glucose containing fermentation starting material by a cell
overexpressing a nucleic acid construct including a genomic, complementary
or composite polynucleotide encoding a polypeptide having a β-glucosidase
catalytic activity and further encoding a signal peptide being in frame with said 
polypeptide, and extracting the alcohol therefrom. 

101. The method of claim 100, wherein said polynucleotide further 
encodes an endoplasmic reticulum retaining peptide being in frame with said 
polypeptide. 

102. A method of producing an alcohol, the method comprising the 
step of fermenting a plant derived glucose containing fermentation starting 
material by a cell, said plant overexpressing a nucleic acid construct including 
a genomic, complementary or composite polynucleotide encoding a 
polypeptide having a β-glucosidase catalytic activity and further encoding a
signal peptide being in frame with said polypeptide, and extracting the alcohol
therefrom.

103. The method of claim 102, wherein said polynucleotide further
encodes an endoplasmic reticulum retaining peptide being in frame with said 
polypeptide. 

104. A method of producing an aroma spreading plant, the method 
comprising the step of overexpressing in the plant a nucleic acid construct 
including a genomic, complementary or composite polynucleotide encoding a 
polypeptide having a β-glucosidase catalytic activity and further encoding a
signal peptide being in frame with said polypeptide, thereby increasing aroma 
spread from the plant. 

105. The method of claim 104, wherein said polynucleotide further 
encodes an endoplasmic reticulum retaining peptide being in frame with said 
polypeptide. 

106. The method of claim 104, wherein overexpressing said nucleic 
acid construct is performed in a tissue specific manner. 

107. The method of claim 104, wherein overexpressing said nucleic 
acid construct is limited to at least one tissue selected from the group 
consisting of flower, fruit, seed, root, stem, pollen and leaves. 

1 TCCATTCG CCCATGCTT AGCGTGTCT T TTCTTTGAACAO fGCATGCGGGACTGTGAATTG 60

61 CATGAGTGGGTAGCTlTGCGGAGACAGCroCACTGGCA 12 Q 

121 ATTCGi^TGCCGTGGCGGACGGTCACTTTGTGGCGCT 180 

181 TCCCCTTTCTCTCGCTGTTTTCGTTTCTGTOCTCCCT 240 

241 CAGGTGTTGCACGGTTGCTCACCTGGTTTGTTTTGCTCCXCCT 300 

301 CATGAGGTTCACTTTGATCGAGGCGGTGGCTCTGACTG<XCT 360 
MetArgPheThrLeuIleGluAlaValAlaLeuThrAlaValSerLevAlaSerAls 

Signal Peptide 

361 ACGTGCCXSTTACTTTCTCCTCAGAATTGCAAT^ 420 

Intronll 

421 TCJVTCATCGCTGACAATG6TCTTTTCATAGG ATGAATTGGCCT 4 80 

AspGl uLeuAl aTyr Ser ProProTy rT yir 

481 Ct^TCCCCTTGGGCCAATGGCCAGGGCGACTGGGCG 540 
ProSerProTrpAlaAsnGlyGlnGlyAs pTrpAlaGlnAlaTyrGlnAraAlaValAsp 

541 ATTGTCTCGCAAATGACATTGGAT(^GAAGGTCAATCTGACCACAGGAACTGG CT 600 
1 1 eVa 1 Se rGl nMe t Th r LeuAs pG 1 u Ly s Va 1 As n LeuTh rTh rG 1 yTh rG 1 y 

60 1 TTACATGGCGCAATCTGTATGCTCCGGCTAACAACTTCTA 660 

Intront2 TrpGluLeuGluLeuCys 

661 GTTGGTCAGACTGGCGGTGTTCCCCGG TAGGTTTGAAAATATT^ 7 20 

ValGlyGlnThrGlyGlyValProArg Intron§3 

721 TATTGATTAACGGTGACAGATTGGGAGTTCCGGGAATGTGTTT^ 780 

LeuGl yVa 1 ProGl yMe tCysLeuGlnAspSer ProLeuG 

781 GCGTTCGCGACT GTAAGCCATCTGCTGTTGTTAGGCTTCGATGCTCTTACTGACACGGCG 8 40 
lyValArgAspS Introo#4 

841 CAGCCGACTACAACTCTGCTTTCCCT^ 900 
erAspTyrAsnSerAlaPheProAlaGlyHetAsnValAlaAlaThrTrpAspLysA 

901 AT CTGGCAT ACCT TCGCGGCAAGGCT ATGGGTCAGGAATTT AGTG ACAAG^ CG A T A 960 
snLeuAl aTy r LeuArgGl yLy s AlaMe t Gl yGlnGluPheSe r AspLy s Gl yAl aAspl 

961 TCCAATTGGGTCCAGCTGCCGGCCCTCTCGOT 1020 
leG InLeuG 1 y Pr oAl aAlaGl y Pr oLeuG 1 yArg Se r Pr oAspGl yGl yArg As nT rpG 

1021 AGGGCTTCTCCCCAGACCCTGCCCTAAGT^ 1080 
luGlyPheSerProAspProAlaLeuSerGlyValLeuPheAlaGluThrlleLysGlyl 

leGlnAspAlaGlyValValAlaThrAlaLysHisTyrlleAlaTyrGluGlnGluHisP 

1141 TCam»GGCGC<nX3AAGCCCAAGGTTTTGGATW 1200 
heArgGlnAlaProGluAlaGlnGlyPheGlyPheAsnlleSerGluSerGlySerAlaA 

1201 ACCTCGACGATAAGACTATGCJW3^GCTGTACCTCTt3GCC^ 1260 
snLeuAspAspLysTbrHetllisGluLeuTyrLeuTrpProPbeAlaAspAlalleArgA 

1261 CAGGTGCTGGOGCTGTGATGTGCTCCTACAACCAGATCAACAACAGTTA 1320 
laGlyAlaGlyAlaValMetCysSerTyrAsnGlnlleAsnAsnSerTyrGlyCysGlnA 

1321 ACAGCTACACTCTGAACAAGCTGCTCAAGGCCGAGCTG^ 1380 
saSerTyrThrLeuA3nLysLeuLeuLysAlaGluLeuGlyPheGlnGlyPheVa LMet S 



Fig. 5a 
1381 GTGACT GG GCT G CT C ACCATGCTCGTG 1440

erAspTrpAlaAlaHisHisAlaGlyValSerGlyAlaLeuAlaGlyLeuAspMetSerM 

1441 TGCCAGGAGACGTO^CTACGACAGTGGTACOTCTT 1500 
etProGlyAspValAspTyrAspSerGlyThrSerTyrTrpGlyThrAsnLeuThrlleS 

1501 GCGTGCTCAACX5GAACGGTGCCCCAATGGCGT^ 1560 
erValLeuAsnGlyThrValProGlnTrpArgValAspAspMetAlaValArglleMetA 

1561 CCGCCTACTACAAGGTCGGCX^GACCGTCTGTGGACTCCrc 1 620 

laAlaTyrTyrLysValGlyArgAspArgLeuTrpThrProProAsnPheSerSerTrpT 

1621 CCAGAGATGAATACGGCTACAAGTACTACTACGTGTCGGAGGGACCGTACGAGAAOT 1680 
hrArgAspGluTyrGlyTyrLysTyrTyrTyrValSerGluGlyProTyrGluLysValA 

1681 ACCAGTACGTGAATGTGCAACGCAACCACMCGAACT^ 1740 
snGlnTyrValAsnValGlnArgAsnHisSerGluLeuIleArgArgXleGlyAlaAspS 

1741 GCACGGTGCTCCTCAAGAACGJVCGGCGCTCTGCCTrrGACTC 1800 
erThrValLeuLeuLysAsnAspGlyAlaLeuProLeuThrGlyLysGluArgbeuValA 

1801 CGCTTATCGGAGAAGATGCGGGCTCCAACCCTTATGGTGCCAAC^ 1860 
laLeuIleGlyGluAspAlaGlySerAsnProTyrGlyAlaAsnGlyCysSerAspArgG 

1861 GATGCGACAATGGAACATTGGCGATGGGCTGGGGAAGTGGTACT^ 1920 
lyCysAspAsnGlyThrLeuAlaMetGlyTrpGlySerGlyThrAlaAsnPbeProTyrL 

1921 TGGTGACCCCCGAGCAGGCCATCTCAAACGAGGTGCTTAAGC^ 1980 
euValThrProGluGlnAIalleSerAanGlu ValLeuLysHisLysAsnGlyValPheT 

1981 CCGCCACCGATAACTGGGCTATCGATCAGATTGAGGCGCTT 204 0 

hrAlaThrAspAsnTrpAlalleAspGlnlleGluAlaLeuAlaLysThrAlaArq 

2041 AAGATCCCCGATriTTTiCCTTCTTGTGCAATGGA 2100 

IntronfS ValSerL 

2101 TTGTCTTTGTCAACGCCGACTCTGGTGAGGGTTACATCAAT 2160 
euValPheValAsnAlaAspSerGlyGluGlyTyrlleAsnValAspGlyAsnLeuGlyA 

2161 ACCGCAGGAACCTGACCCTGTGGAGGAACCGCGAT/^TGTG^ 2220 
spArgArgAsnLeuThrLeuTrpArgAsnArgAspAsnVall leLysAlaAlaAlaSerA 

2281 ACCACAACCCCAATGTTACCGCTATCCTCTGGGGTGGTTTGCCCGC 2 34 0 

snHisAsnProAsnValThrAlalleLeuTrpGlyGlyLeuProGlyGlnGluSerGlyA 

2341 ACTCTCTTGCCGACGTCCTCTATGGCCGTGTCAACCCCGGTGCXAACT 2 4 00 

3nSerLeuAlaAspValLeUTyrGlyArgValAsnProGlyAlaLysSerProPheThrT 

2401 GGGGCAAGACTCGTGAGGCCTACCAAGACTACTTGGTCACCGAGCCCAACAACGGCAACG 24 60 
rpGlyLysThrArgGlxiAlaTyrGlnAspTyrLeuValThrGluProAsnAsnGlyAsnG 

2461 GJVGCCCCTCAGGAAGACTTTGTCGAGGGCGTCTTCATTGACTACCG^ 2520 
lyAlaProGlnGluAspPheValGluGlyValPhelleAspTyrArgGXyPheAspLysA 

2521 GCAACGAGACCCCGATCTACGAGTTCGGCTATGGTCTGAGCTACGCCACTTTCAACTACT 2580 
rgAsnGluThrProlleTyrGluPheGlyTyrGlyLeuSerTyrAlaThrPheAsnTyrS 

2581 CGAACCTTGAGGTGCAGGTGCTGAGCGCCXrCTGCATACGAGCCTGCT^ 2640 
erAsaLeuGluValGlnValLeuSerAlaProAlaTyrGluProAlaSerGlyGlaThrG 

2701 TGCAGAGAATTACCAAGTTCATCTACCCCTGGCTCAACGCT 2760 
luGlnArglleThrLysPhelleTyrProTrpLeuAsnGlyThrAspLeuGluAlaSerS 

2761 CCGGGGATGCTAGCTACGGGCAGGACTCCTCCGACTATCTTCCCGAGGGAGCCACCGATG 2820 
ArClYAapAlaSerTvrGlvGlnAspSerSerAspTyrLeuProGlu GlyAlaThrAspG 

Fig. 5a (continued) 
2821 GCTCTCXGCAACCGATCCTGCCTGCC^^ 2880
lySerAlaGlnProIleLeuProAlaGlyGlyGlyProGlyGlyAsnPro AraLeuTvrA 

2881 ACGAGCTCATCCGCGTGTCAGTGACCATOVAGAACACCGGC^ 294 0 
SpGluLeuIleArgValSerValThrlleLysAsnThrGlyLysValAlaGlyAspGluV 

2941 rr(XCCAACTGGTAAGTAAACATGAGGTCCGAACGAGGTTGAA 3000 
alProGlnLeu Infcron§6 

3001 AOTATGTTTCCCTTGGCGGTCCCAATGAGCCCAAGATCGTGCTGCGTCAATTCGAGCGCA 30 60 
TyrValSerLeuGlyGlyProAsnGluProLysIleValLeuArgGlnPheGluArgl 

3061 TCACGCTGCAGCCGTCGGAGGAGACGAAGTGGAGCACGACTCTGACGCGCCGTGACCTTG 3120 
leThrLeuGlnProSerGluGluThrLysTrpSerThrThrLeuThrArgArgAspLeuA 

3121 CAAACTGGAATGTTGAGAAGCAGGACTGGGAGATTACGTCGTATC 3180 
laAsnTrpAsnValGluLysGlnAspTrpGluIleThrSerTyrProLysMetValPheV 

3181 TCGGAAGCTCCTCGCGGAAGCTGCCGCTCCGGGCGTCT 3 2 4 O 
alGlySerSerSerArgLysLeuProLeuArgAlaSerLeuProThrValHis* ** 

3241 CTCTCAAATGGTATACCATGATGGCCGTGGTATATGAATTAATGATTTATGCCAACAGCA 330O 

3301 AGACCACTGTAGATGTAGATGTAGAATGAGTATTGCGTAGTAGCGTGTAGATGATGATAC 3360 

3361 AAGCGATCCGACACATGGTAGGAAGAGTGGCGCTAGTTGGGGCGGAAACCAAGCGACGTC 3420 

3421 ATCCCCrGCCGACTTCGCCAGTCTn xr rT ^ ^^ 3480 

3481 ATCCAGCAACCATTGCCAATTGCCTCTACAACAACTAATTGCCATAATACTCTACTCCTA 354 0 

3541 TTCAATATATACACCACi^TCTCGACATAATCACACAAGCCTGAACACACGAGCAACCAT 3600 

3601 GCCCTCTCCCGATCCTCCAGCCCCAGCGATACGACCCTTCCAACCACCCATAACAGCGCT 3660 

3661 CCTCATCTACCCAGCGACCCTAATCGTGGGATCACTCTTCTCCGTCCTCTCTCCCACCGC 372 0 

3721 ACAAGGCACACGCGACGACGGCTCCAGCACCCTCCACCCACACGTCGAGCCCCTAGCCCC 3780 

3781 GTCCATCGCGTCAG ACCTC AACCTCTCCTTTCCTCCGCCGCGCCCCGTCAACTACTTCGC 384 0 

3841 TCGCAAAGACAACATCTTCAATCTATATTCGTCAAAGTCGGC 388 5 

SEQUENCE LISTING

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2583

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATGAGGTTCA CTTTGATCGA GGCGGTGGCT CTGACTGCCG TCTCGCTGGC 50 

CAGCGCTGAT GAATTGGCCT ACTCCCCACC GTATTACCCA TCCCCTTGGG 100 

CCAATGGCCA GGGCGACTGG GCGCAGGCAT ACCAGCGCGC TGTTGATATT 150 

GTCTCGCAAA TGACATTGGA TGAGAAGGTC AATCTGACCA CAGGAACTGG 200 

ATGGGAATTG GAACTATGTG TTGGTCAGAC TGGCGGTGTT CCCCGATTGG 250 

GAGTTCCGGG AATGTGTTTA CAGGATAGCC CTCTGGGCGT TCGCGACTCC 300 

GACTACAACT CTGCTTTCCC TGCCGGCATG AACGTGGCTG CGACCTGGGA 350 

CAAGAATCTG GCATACCTTC GCGGCAAGGC TATGGGTCAG GAATTTAGTG 400

ACAAGGGTGC CGATATCCAA TTGGGTCCAG CTGCCGGCCC TCTCGGTAGA 450

AGTCCCGACG GTGGTCGTAA CTGGGAGGGC TTCTCCCCAG ACCCTGCCCT 500 

AAGTGGTGTG CTCTTTGCCG AGACCATCAA GGGTATCCAA GATGCTGGTG 550 

TGGTTGCGAC GGCTAAGCAC TACATTGCTT ACGAGCAAGA GCATTTCCGT 600 

CAGGCGCCTG AAGCCCAAGG TTTTGGATTT AATATTTCCG AGAGTGGAAG 650 

TGCGAACCTC GATGATAAGA CTATGCACGA GCTGTACCTC TGGCCCTTCG 700

CGGATGCCAT CCGTGCAGGT GCTGGCGCTG TGATGTGCTC CTACAACCAG 750 

ATCAACAACA GTTATGGCTG CCAGAACAGC TACACTCTGA ACAAGCTGCT 800 

CAAGGCCGAG CTGGGCTTCC AGGGCTTTGT CATGAGTGAT TGGGCTGCTC 850

ACCATGCTGG TGTGAGTGGT GCTTTGGCAG GATTGGATAT GTCTATGCCA 900 

GGAGACGTCG ACTACGACAG TGGTACGTCT TACTGGGGTA CAAACTTGAC 950 

CATTAGCGTG CTCAACGGAA CGGTGCCCCA ATGGCGTGTT GATGACATGG 1000 

CTGTCCGCAT CATGGCCGCC TACTACAAGG TCGGCCGTGA CCGTCTGTGG 1050 

ACTCCTCCCA ACTTCAGCTC ATGGACCAGA GATGAATACG GCTACAAGTA 1100 

CTACTACGTG TCGGAGGGAC CGTACGAGAA GGTCAACCAG TACGTGAATG 1150 

TGCAACGCAA CCACAGCGAA CTGATTCGCC GCATTGGAGC GGACAGCACG 1200 

GTGCTCCTCA AGAACGACGG CGCTCTGCCT TTGACTGGTA AGGAGCGCCT 1250 

GGTCGCGCTT ATCGGAGAAG ATGCGGGCTC CAACCCTTAT GGTGCCAACG 1300 

GCTGCAGTGA CCGTGGATGC GACAATGGAA CATTGGCGAT GGGCTGGGGA 1350 

AGTGGTACTG CCAACTTCCC ATACCTGGTG ACCCCCGAGC AGGCCATCTC 1400 

AAACGAGGTG CTTAAGCACA AGAATGGTGT ATTCACCGCC ACCGATAACT 1450

GGGCTATCGA TCAAATTGAG GCGCTTGCTA AGACCGCCAG TGTCTCTCTT 1500 

GTCTTTGTCA ACGCCGACTC TGGTGAGGGT TACATCAATG TGGACGGAAA 1550 

CCTGGGTGAC CGCAGGAACC TGACCCTGTG GAGGAACGGC GATAATGTGA 1600 

TCAAGGCTGC TGCTAGCAAC TGCAACAACA CAATCGTTGT CATTCACTCT 1650 

GTCGGACCAG TCTTGGTTAA CGAGTGGTAC GACAACCCCA ATGTTACCGC 1700 

TATCCTCTGG GGTGGTTTGC CCGGTCAGGA GTCTGGCAAC TCTCTTGCCG 1750

ACGTCCTCTA TGGCCGTGTC AACCCCGGTG CCAAGTCGCC CTTTACCTGG 1800 

GGCAAGACTC GTGAGGCCTA CCAAGACTAC TTGGTCACCG AGCCCAACAA 1850 

CGGCAACGGA GCCCCTCAGG AAGACTTTGT CGAGGGCGTC TTCATTGACT 1900 

ACCGTGGATT TGACAAGCGC AACGAGACCC CGATCTACGA GTTCGGCTAT 1950 

GGTCTGAGCT ACACCACTTT CAACTACTCG AACCTTGAGG TGCAGGTGCT 2000 

GAGCGCCCCT GCATACGAGC CTGCTTCGGG TGAGACCGAG GCAGCGCCAA 2050 

CCTTCGGAGA GGTTGGAAAT GCGTCGGATT ACCTCTACCC CAGCGGATTG 2100 

CTGAGAATTA CCAAGTTCAT CTACCCCTGG CTCAACGGTA CCGATCTCGA 2150 

GGCATCTTCC GGGGATGCTA GCTACGGGCA GGACTCCTCC GACTATCTTC 2200 

CCGAGGGAGC CACCGATGGC TCTGCGCAAC CGATCCTGCC TGCCGGTGGC 2250 

GGTCCTGGCG GCAACCCTCG CCTGTACGAC GAGCTCATCC GCGTGTCAGT 2300 

GACCATCAAG AACACCGGCA AGGTTGCTGG TGATGAAGTT CCCCAACTGT 2350 

ATGTTTCCCT TGGCGGTCCC AATGAGCCCA AGATCGTGCT GCGTCAATTC 2400

GAGCGCATCA CGCTGCAGCC GTCGGAGGAG ACGAAGTGGA GCACGACTCT 2450

GACGCGCCGT GACCTTGCAA ACTGGAATGT TGAGAAGCAG GACTGGGAGA 2500

TTACGTCGTA TCCCAAGATG GTGTTTGTCG GAAGCTCCTC GCGGAAGCTG 2550 

CCGCTCCGGG CGTCTCTGCC TACTGTTCAC TAA 2583
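
As an informal consistency check on the listing (not part of the original disclosure), the stated lengths of SEQ ID NO:1 (2583 nucleotides) and SEQ ID NO:2 (860 amino acids) agree if the open reading frame ends in a single stop codon, since 2583 = 3 x (860 + 1). The Python sketch below assumes the standard genetic code; the helper names `translate` and `expected_protein_length` are hypothetical and chosen only for illustration. It translates the first and last lines of SEQ ID NO:1 as printed above.

```python
# Informal check only; helper names are hypothetical and not taken from the source.
BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA codon by codon (spaces ignored); '*' marks a stop codon."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # drop any incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def expected_protein_length(orf_nt: int) -> int:
    """Residues encoded by an ORF of orf_nt nucleotides ending in one stop codon."""
    if orf_nt % 3:
        raise ValueError("ORF length must be a multiple of three")
    return orf_nt // 3 - 1


# First line (nucleotides 1-50) and last line (2551-2583) of SEQ ID NO:1.
first_line = "ATGAGGTTCA CTTTGATCGA GGCGGTGGCT CTGACTGCCG TCTCGCTGGC"
last_line = "CCGCTCCGGG CGTCTCTGCC TACTGTTCAC TAA"

print(translate(first_line))          # MRFTLIEAVALTAVSL
print(translate(last_line))           # PLRASLPTVH*
print(expected_protein_length(2583))  # 860
```

The first translation corresponds to Met Arg Phe Thr Leu Ile Glu Ala Val Ala Leu Thr Ala Val Ser Leu, the opening residues of SEQ ID NO:2, and the last line ends in the TAA stop codon.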

(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 860 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Arg Phe Thr Leu Ile Glu Ala Val Ala Leu Thr Ala Val Ser

5 10 15
Leu Ala Ser Ala Asp Glu Leu Ala Tyr Ser Pro Pro Tyr Tyr Pro

20 25 30
Ser Pro Trp Ala Asn Gly Gln Gly Asp Trp Ala Gln Ala Tyr Gln

35 40 45
Arg Ala Val Asp Ile Val Ser Gln Met Thr Leu Asp Glu Lys Val

50 55 60
Asn Leu Thr Thr Gly Thr Gly Trp Glu Leu Glu Leu Cys Val Gly

65 70 75
Gln Thr Gly Gly Val Pro Arg Leu Gly Val Pro Gly Met Cys Leu

80 85 90
Gln Asp Ser Pro Leu Gly Val Arg Asp Ser Asp Tyr Asn Ser Ala

95 100 105
Phe Pro Ala Gly Met Asn Val Ala Ala Thr Trp Asp Lys Asn Leu

110 115 120
Ala Tyr Leu Arg Gly Lys Ala Met Gly Gln Glu Phe Ser Asp Lys

125 130 135
Gly Ala Asp Ile Gln Leu Gly Pro Ala Ala Gly Pro Leu Gly Arg

140 145 150
Ser Pro Asp Gly Gly Arg Asn Trp Glu Gly Phe Ser Pro Asp Pro

155 160 165
Ala Leu Ser Gly Val Leu Phe Ala Glu Thr Ile Lys Gly Ile Gln

170 175 180
Asp Ala Gly Val Val Ala Thr Ala Lys His Tyr Ile Ala Tyr Glu

185 190 195
Gln Glu His Phe Arg Gln Ala Pro Glu Ala Gln Gly Phe Gly Phe

200 205 210
Asn Ile Ser Glu Ser Gly Ser Ala Asn Leu Asp Asp Lys Thr Met

215 220 225
His Glu Leu Tyr Leu Trp Pro Phe Ala Asp Ala Ile Arg Ala Gly

230 235 240
Ala Gly Ala Val Met Cys Ser Tyr Asn Gln Ile Asn Asn Ser Tyr

245 250 255
Gly Cys Gln Asn Ser Tyr Thr Leu Asn Lys Leu Leu Lys Ala Glu

260 265 270
Leu Gly Phe Gln Gly Phe Val Met Ser Asp Trp Ala Ala His His

275 280 285
Ala Gly Val Ser Gly Ala Leu Ala Gly Leu Asp Met Ser Met Pro

290 295 300
Gly Asp Val Asp Tyr Asp Ser Gly Thr Ser Tyr Trp Gly Thr Asn

305 310 315
Leu Thr Ile Ser Val Leu Asn Gly Thr Val Pro Gln Trp Arg Val

320 325 330
Asp Asp Met Ala Val Arg Ile Met Ala Ala Tyr Tyr Lys Val Gly

335 340 345
Arg Asp Arg Leu Trp Thr Pro Pro Asn Phe Ser Ser Trp Thr Arg

350 355 360
Asp Glu Tyr Gly Tyr Lys Tyr Tyr Tyr Val Ser Glu Gly Pro Tyr

365 370 375
Glu Lys Val Asn Gln Tyr Val Asn Val Gln Arg Asn His Ser Glu

380 385 390
Leu Ile Arg Arg Ile Gly Ala Asp Ser Thr Val Leu Leu Lys Asn

395 400 405
Asp Gly Ala Leu Pro Leu Thr Gly Lys Glu Arg Leu Val Ala Leu

410 415 420
Ile Gly Glu Asp Ala Gly Ser Asn Pro Tyr Gly Ala Asn Gly Cys

425 430 435
Ser Asp Arg Gly Cys Asp Asn Gly Thr Leu Ala Met Gly Trp Gly

440 445 450
Ser Gly Thr Ala Asn Phe Pro Tyr Leu Val Thr Pro Glu Gln Ala

455 460 465
Ile Ser Asn Glu Val Leu Lys His Lys Asn Gly Val Phe Thr Ala

470 475 480
Thr Asp Asn Trp Ala Ile Asp Gln Ile Glu Ala Leu Ala Lys Thr

485 490 495
Ala Ser Val Ser Leu Val Phe Val Asn Ala Asp Ser Gly Glu Gly

500 505 510
Tyr Ile Asn Val Asp Gly Asn Leu Gly Asp Arg Arg Asn Leu Thr

515 520 525
Leu Trp Arg Asn Gly Asp Asn Val Ile Lys Ala Ala Ala Ser Asn

530 535 540
Cys Asn Asn Thr Ile Val Val Ile His Ser Val Gly Pro Val Leu

545 550 555
Val Asn Glu Trp Tyr Asp Asn Pro Asn Val Thr Ala Ile Leu Trp

560 565 570
Gly Gly Leu Pro Gly Gln Glu Ser Gly Asn Ser Leu Ala Asp Val

575 580 585
Leu Tyr Gly Arg Val Asn Pro Gly Ala Lys Ser Pro Phe Thr Trp

590 595 600
Gly Lys Thr Arg Glu Ala Tyr Gln Asp Tyr Leu Val Thr Glu Pro

605 610 615
Asn Asn Gly Asn Gly Ala Pro Gln Glu Asp

620 625
Phe Ile Asp Tyr Arg Gly Phe Asp Lys Arg

635 640
Tyr Glu Phe Gly Tyr Gly Leu Ser Tyr Thr

650 655
Asn Leu Glu Val Gln Val Leu Ser Ala Pro

665 670
Ser Gly Glu Thr Glu Ala Ala Pro Thr Phe


Phe 


val 


Glu 


Gly 


Val 
630 


Asn 


Glu 


Thr 


Pro 


He 
645 


Thr 


Phe 


Asn 


Tyr 


Ser 
660 


Ala 


Tyr 


Glu 


Pro 


Ala 
675 


Gly 


Glu 


Val 


Gly 


Asn 
680 685 690 

Ala Ser Asp Tyr Leu Tyr Pro Ser Gly Leu Leu Arg Ile Thr Lys 
695 700 705 

Phe Ile Tyr Pro Trp Leu Asn Gly Thr Asp Leu Glu Ala Ser Ser 
710 715 720 

Gly Asp Ala Ser Tyr Gly Gln Asp Ser Ser Asp Tyr Leu Pro Glu 
725 730 735 

Gly Ala Thr Asp Gly Ser Ala Gln Pro Ile Leu Pro Ala Gly Gly 
740 745 750 

Gly Pro Gly Gly Asn Pro Arg Leu Tyr Asp Glu Leu Ile Arg Val 
755 760 765 

Ser Val Thr Ile Lys Asn Thr Gly Lys Val Ala Gly Asp Glu Val 
770 775 780 

Pro Gln Leu Tyr Val Ser Leu Gly Gly Pro Asn Glu Pro Lys Ile 
785 790 795 

Val Leu Arg Gln Phe Glu Arg Ile Thr Leu Gln Pro Ser Glu Glu 
800 805 810 

Thr Lys Trp Ser Thr Thr Leu Thr Arg Arg Asp Leu Ala Asn Trp 
815 820 825 

Asn Val Glu Lys Gln Asp Trp Glu Ile Thr Ser Tyr Pro Lys Met 
830 835 840 

Val Phe Val Gly Ser Ser Ser Arg Lys Leu Pro Leu Arg Ala Ser 
845 850 855 

Leu Pro Thr Val His 
860 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3885 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

TCCATTCGCC CATGCTTAGC GTGTCTTTTC TTTGAACACT GCATGCGGGA 50 

CTGTGAATTG CATGAGTGGG TAGCTTTGCG GAGACAGCTG CACTGGCATA 100 

CATCATCGTT GGGTTCCTCA ATTCGCATGC CGTGGCGGAC GGTCACTTTG 150 

TGGCGCTCAA ACTATTTAAT ATGGCCCAGC TCCCCTTTCT CTCGCTGTTT 200 

TCGTTTCTGT CCTCCCTAAA CCTCCAGTCT CTCCATTGGA CAGGTGTTGC 250 

ACGGTTGCTC ACCTGGTTTG TTTTGCTCCC CCTTTGGGCG ACCTTGCCAT 300 

CATGAGGTTC ACTTTGATCG AGGCGGTGGC TCTGACTGCC GTCTCGCTGG 350 

CCAGCGCTGT ACGTGCCGTT ACTTTGTCCT GAGAATTGCA ATTGTGCTTA 400 

ATTAGATTCA TTTGTTTGTT TCATCATCGC TGACAATGGT CTTTTCATAG 450 

GATGAATTGG CCTACTCCCC ACCGTATTAC CCATCCCCTT GGGCCAATGG 500 

CCAGGGCGAC TGGGCGCAGG CATACCAGCG CGCTGTTGAT ATTGTCTCGC 550 

AAATGACATT GGATGAGAAG GTCAATCTGA CCACAGGAAC TGGGTAGGGC 600 

TTACATGGCG CAATCTGTAT GCTCCGGCTA ACAACTTCTA CATGGGAATT 650 

GGAACTATGT GTTGGTCAGA CTGGCGGTGT TCCCCGGTAG GTTTGAAAAT 700 

ATTGTCGAGA CAGGGGACAT TATTGATTAA CGGTGACAGA TTGGGAGTTC 750 

CGGGAATGTG TTTACAGGAT AGCCCTCTGG GCGTTCGCGA CTGTAAGCCA 800 

TCTGCTGTTG TTAGGCTTCG ATGCTCTTAC TGACACGGCG CAGCCGACTA 850 

CAACTCTGCT TTCCCTGCCG GCATGAACGT GGCTGCAACC TGGGACAAGA 900 

ATCTGGCATA CCTTCGCGGC AAGGCTATGG GTCAGGAATT TAGTGACAAG 950 

GGTGCCGATA TCCAATTGGG TCCAGCTGCC GGCCCTCTCG GTAGAAGTCC 1000 

CGACGGTGGT CGTAACTGGG AGGGCTTCTC CCCAGACCCT GCCCTAAGTG 1050 

GTGTGCTCTT TGCCGAGACC ATCAAGGGTA TCCAAGATGC TGGTGTGGTT 1100 

GCGACGGCTA AGCACTACAT TGCTTACGAG CAAGAGCATT TCCGTCAGGC 1150 

GCCTGAAGCC CAAGGTTTTG GATTTAATAT TTCCGAGAGT GGAAGTGCGA 1200 

ACCTCGATGA TAAGACTATG CACGAGCTGT ACCTCTGGCC CTTCGCGGAT 1250 

GCCATCCGTG CAGGTGCTGG CGCTGTGATG TGCTCCTACA ACCAGATCAA 1300 

CAACAGTTAT GGCTGCCAGA ACAGCTACAC TCTGAACAAG CTGCTCAAGG 1350 

CCGAGCTGGG CTTCCAGGGC TTTGTCATGA GTGATTGGGC TGCTCACCAT 1400 

GCTGGTGTGA GTGGTGCTTT GGCAGGATTG GATATGTCTA TGCCAGGAGA 1450 

CGTCGACTAC GACAGTGGTA CGTCTTACTG GGGTACAAAC TTGACCATTA 1500 

GCGTGCTCAA CGGAACGGTG CCCCAATGGC GTGTTGATGA CATGGCTGTC 1550 

CGCATCATGG CCGCCTACTA CAAGGTCGGC CGTGACCGTC TGTGGACTCC 1600 

TCCCAACTTC AGCTCATGGA CCAGAGATGA ATACGGCTAC AAGTACTACT 1650 

ACGTGTCGGA GGGACCGTAC GAGAAGGTCA ACCAGTACGT GAATGTGCAA 1700 

CGCAACCACA GCGAACTGAT TCGCCGCATT GGAGCGGACA GCACGGTGCT 1750 

CCTCAAGAAC GACGGCGCTC TGCCTTTGAC TGGTAAGGAG CGCCTGGTCG 1800 

CGCTTATCGG AGAAGATGCG GGCTCCAACC CTTATGGTGC CAACGGCTGC 1850 

AGTGACCGTG GATGCGACAA TGGAACATTG GCGATGGGCT GGGGAAGTGG 1900 

TACTGCCAAC TTCCCATACC TGGTGACCCC CGAGCAGGCC ATCTCAAACG 1950 

AGGTGCTTAA GCACAAGAAT GGTGTATTCA CCGCCACCGA TAACTGGGCT 2000 

ATCGATCAAA TTGAGGCGCT TGCTAAGACC GCCAGGTAAG AAGATCCCCG 2050 

ATTCTTTTCC TTCTTGTGCA ATGGATGCTG ACAACATGCT AGTGTCTCTC 2100 

TTGTCTTTGT CAACGCCGAC TCTGGTGAGG GTTACATCAA TGTGGACGGA 2150 

AACCTGGGTG ACCGCAGGAA CCTGACCCTG TGGAGGAACC GCGATAATGT 2200 

GATCAAGGCT GCTGCTAGCA ACTGCAACAA CACAATCGTT GTCATTCACT 2250 
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CTGTCGGACC 
GCTATCCTCT 
CGACGTCCTC 
GGGGCAAGAC 
AACGGCAACG 
CTACCGTGGA 
ATGGTCTGAG 
CTGAGCGCCC 
AACCTTCGGA 
TGCTGAGAAT 
GAGGCATCTT 
TCCCGAGGGA 
GCGGTCCTGG 
GTGACCATCA 
GGTAAGTAAA 
AGTATGTTTC 
TTCGAGCGCA 
TCTGACGCGC 
AGATTACGTC 
CTGCCGCTCC 
GTATACCATG 
AGACCACTGT 
AT GAT GAT AC 
GGCGGAAACC 
TTTTCCTCTT 
TGCCTCTACA 
ACACCACAAT 
GCCCTCTCCC 
TAACAGCGCT 
TCCGTCCTCT 
CCTCCACCCA 
ACCTCTCCTT 
AACATCTTCA 



AGTCTTGGTT 
GGGGTGGTTT 
TATGGCCGTG 
TCGTGAGGCC 
GAGCCCCTCA 
TTTGACAAGC 
CTACACCACT 
CTGCATACGA 
GAGGTTGGAA 
TACCAAGTTC 
CCGGGGATGC 
GCCACCGATG 
CGGCAACCCT 
AGAACACCGG 
CATGAGGTCC 
CCTTGGCGGT 
TCACGCTGCA 
CGTGACCTTG 
GTATCCCAAG 
GGGCGTCTCT 
ATGGCCGTGG 
AGATGTAGAT 
AAGCGATCCG 
AAGCGACGTC 
CAGCCTTCTT 
ACAACTAATT 
CTCGACATAA 
GATCCTCCAG 
CCTCATCTAC 
CTCCCACCGC 
CACGTCGAGC 
TCCTCCGCCG 
ATCTATATTC 



AACGAGTGGT 
GCCCGGTCAG 
TCAACCCCGG 
TACCAAGACT 
GGAAGACTTT 
GCAACGAGAC 
TTCAACTACT 
GCCTGCTTCG 
ATGCGTCGGA 
ATCTACCCCT 
TAGCTACGGG 
GCTCTGCGCA 
CGCCTGTACG 
CAAGGTTGCT 
GAACGAGGTT 
CCCAATGAGC 
GCCGTCGGAG 
CAAACTGGAA 
ATGGTGTTTG 
GCCTACTGTT 
TATATGAATT 
GTAGAATGAG 
ACACATGGTA 
ATCCGCTGCC 
CCTCCGCTTA 
GCCATAATAC 
TCACACAAGC 
CCCCAGCGAT 
CCAGCGACCC 
ACAAGGCACA 
CCCTAGCCCC 
CGCCCCGTCA 
GTCAAAGTCG 



ACGACAACCC 
GAGTCTGGCA 
TGCCAAGTCG 
ACTTGGTCAC 
GTCGAGGGCG 
CCCGATCTAC 
CGAACCTTGA 
GGTGAGACCG 
TTACCTCTAC 
GGCTCAACGG 
CAGGACTCCT 
ACCGATCCTG 
ACGAGCTCAT 
GGTGATGAAG 
GAACAAAGCT 
CCAAGATCGT 
GAGACGAAGT 
TGTTGAGAAG 
TCGGAAGCTC 
CACTAAATAG 
AATGATTTAT 
TATTGCGTAG 
GGAAGAGTGG 
GACTTCGCCA 
ATCCAGCAAC 
TCTACTCCTA 
CTGAACACAC 
ACGACCCTTC 
TAATCGTGGG 
CGCGACGACG 
GTCCATCGCG 
ACTACTTCGC 
GCTGG 



CAATGTTACC 
ACTCTCTTGC 
CCCTTTACCT 
CGAGCCCAAC 
TCTTCATTGA 
GAGTTCGGCT 
GGTGCAGGTG 
AGGCAGCGCC 
CCCAGCGGAT 
TACCGATCTC 
CCGACTATCT 
CCTGCCGGTG 
CCGCGTGTCA 
TTCCCCAACT 
AATCAGTCGC 
GCTGCGTCAA 
GGAGCACGAC 
CAGGACTGGG 
CTCGCGGAAG 
CTCTCAAATG 
GCCAACAGCA 
TAGCGTGTAG 
CGCTAGTTGG 
GTCTTTCTTC 
CATTGCCAAT 
TTCAATATAT 
GAGCAACCAT 
CAACCACCCA 
ATCACTCTTC 
GCTCCAGCAC 
TCAGACCTCA 
TCGCAAAGAC 



(2) 



INFORMATION FOR SEQ ID NO: 4: 



(i) 



<xi) 



SEQUENCE CHARACTERISTICS: 



(A) 
<B) 
(C) 
<D> 



LENGTH: 
TYPE: 

STRANDEDNESS: 
TOPOLOGY : 



Ser Pro Pro Tyr Tyr Pro 
5 



SEQUENCE DESCRIPTION: 



amino acid 

single 

linear 

SEQ ID NO:4 : 



2300 
2350 
2400 
2450 
2500 
2550 
2600 
2650 
2700 
2750 
2800 
2850 
2900 
2950 
3000 
3050 
3100 
3150 
3200 
3250 
3300 
3350 
3400 
3450 
3500 
3550 
3600 
3650 
3700 
3750 
3800 
3850 
3885 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
SNCCNCCNTA YTAYCC 16 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Gln Pro Ile Leu Pro Ala Gly Gly 
5 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
TCCIGCNGGN ARDATNGGYT G 21 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
AAACCATGGC TGATGAATTG GCATACTCCC CACC 34 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
AAAGGATCCT TAGTGAACAG TAGGCAGAGA CGC 33 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
Asp Glu Leu Ala Tyr Ser Pro Pro Tyr Tyr Pro Ser Pro Trp Ala 
5 10 15 

Asn Gly Gln Gly Asp 
20 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
Val Leu Lys His Lys Asn Gly Val Phe Thr Ala Thr Asp Asn Trp 
5 10 15 

Ala Ile Asp Gln Ile Glu Ala Leu Ala Lys 
20 25 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
Gly Ala Thr Asp Gly Ser Ala Gln Pro Ile Leu Pro Ala Gly Gly 
5 10 15 

Gly Pro Gly Gly Asn Pro 
20 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3212 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GAATTCCCGA TCCTATCTGT CACTTCATCA AAAGGACAGT AGAAAAGGAA 50 
GGTGGCACTA CAAATGCCAT CATTGCGATA AAGGAAAGGC TATCGTTCAA 100 
GATGCCTCTG CCGACAGTGG TCCCAAAGAT GGACCCCCAC CCACGAGGAG 150 
CATCGTGGAA AAAGAAGACG TTCCAACCAC GTCTTCAAAG CAAGTGGATT 200 
GATGTGATAT CTCCACTGAC GTAAGGGATG ACGCACAATC CCACTATCCT 250 
TCGCAAGACC CTTCCTCTAT ATAAGGAAGT TCATTTCATT TGGAGAGGAC 300 
AGGCTTCTTG AGATCCTTCA ACAATTACCA ACAACAACAA ACAACAAACA 350 
ACATTACAAT TACTATTTAC AATTACAGTC GACCATGGCT GATGAATTGG 400 
CCTACTCCCC ACCGTATTAC CCATCCCCTT GGGCCAATGG CCAGGGCGAC 450 
TGGGCGCAGG CATACCAGCG CGCTGTTGAT ATTGTCTCGC AAATGACATT 500 
GGATGAGAAG GTCAATCTGA CCACAGGAAC TGGATGGGAA TTGGAACTAT 550 
GTGTTGGTCA GACTGGCGGT GTTCCCCGAT TGGGAGTTCC GGGAATGTGT 600 
TTACAGGATA GCCCTCTGGG CGTTCGCGAC TCCGACTACA ACTCTGCTTT 650 
CCCTGCCGGC ATGAACGTGG CTGCGACCTG GGACAAGAAT CTGGCATACC 700 
TTCGCGGCAA GGCTATGGGT CAGGAATTTA GTGACAAGGG TGCCGATATC 750 
CAATTGGGTC CAGCTGCCGG CCCTCTCGGT AGAAGTCCCG ACGGTGGTCG 800 
TAACTGGGAG GGCTTCTCCC CAGACCCTGC CCTAAGTGGT GTGCTCTTTG 850 
CCGAGACCAT CAAGGGTATC CAAGATGCTG GTGTGGTTGC GACGGCTAAG 900 
CACTACATTG CTTACGAGCA AGAGCATTTC CGTCAGGCGC CTGAAGCCCA 950 
AGGTTTTGGA TTTAATATTT CCGAGAGTGG AAGTGCGAAC CTCGATGATA 1000 
AGACTATGCA CGAGCTGTAC CTCTGGCCCT TCGCGGATGC CATCCGTGCA 1050 
GGTGCTGGCG CTGTGATGTG CTCCTACAAC CAGATCAACA ACAGTTATGG 1100 
CTGCCAGAAC AGCTACACTC TGAACAAGCT GCTCAAGGCC GAGCTGGGCT 1150 
TCCAGGGCTT TGTCATGAGT GATTGGGCTG CTCACCATGC TGGTGTGAGT 1200 
GGTGCTTTGG CAGGATTGGA TATGTCTATG CCAGGAGACG TCGACTACGA 1250 
CAGTGGTACG TCTTACTGGG GTACAAACTT GACCATTAGC GTGCTCAACG 1300 
GAACGGTGCC CCAATGGCGT GTTGATGACA TGGCTGTCCG CATCATGGCC 1350 
GCCTACTACA AGGTCGGCCG TGACCGTCTG TGGACTCCTC CCAACTTCAG 1400 
CTCATGGACC AGAGATGAAT ACGGCTACAA GTACTACTAC GTGTCGGAGG 1450 
GACCGTACGA GAAGGTCAAC CAGTACGTGA ATGTGCAACG CAACCACAGC 1500 
GAACTGATTC GCCGCATTGG AGCGGACAGC ACGGTGCTCC TCAAGAACGA 1550 
CGGCGCTCTG CCTTTGACTG GTAAGGAGCG CCTGGTCGCG CTTATCGGAG 1600 
AAGATGCGGG CTCCAACCCT TATGGTGCCA ACGGCTGCAG TGACCGTGGA 1650 
TGCGACAATG GAACATTGGC GATGGGCTGG GGAAGTGGTA CTGCCAACTT 1700 
CCCATACCTG GTGACCCCCG AGCAGGCCAT CTCAAACGAG GTGCTTAAGC 1750 
ACAAGAATGG TGTATTCACC GCCACCGATA ACTGGGCTAT CGATCAAATT 1800 
GAGGCGCTTG CTAAGACCGC CAGTGTCTCT CTTGTCTTTG TCAACGCCGA 1850 
CTCTGGTGAG GGTTACATCA ATGTGGACGG AAACCTGGGT GACCGCAGGA 1900 
ACCTGACCCT GTGGAGGAAC GGCGATAATG TGATCAAGGC TGCTGCTAGC 1950 
AACTGCAACA ACACAATCGT TGTCATTCAC TCTGTCGGAC CAGTCTTGGT 2000 
TAACGAGTGG TACGACAACC CCAATGTTAC CGCTATCCTC TGGGGTGGTT 2050 
TGCCCGGTCA GGAGTCTGGC AACTCTCTTG CCGACGTCCT CTATGGCCGT 2100 
GTCAACCCCG GTGCCAAGTC GCCCTTTACC TGGGGCAAGA CTCGTGAGGC 2150 
CTACCAAGAC TACTTGGTCA CCGAGCCCAA CAACGGCAAC GGAGCCCCTC 2200 
AGGAAGACTT TGTCGAGGGC GTCTTCATTG ACTACCGTGG ATTTGACAAG 2250 
CGCAACGAGA CCCCGATCTA CGAGTTCGGC TATGGTCTGA GCTACACCAC 2300 
TTTCAACTAC TCGAACCTTG AGGTGCAGGT GCTGAGCGCC CCTGCATACG 2350 
AGCCTGCTTC GGGTGAGACC GAGGCAGCGC CAACCTTCGG AGAGGTTGGA 2400 
AATGCGTCGG ATTACCTCTA CCCCAGCGGA TTGCTGAGAA TTACCAAGTT 2450 
CATCTACCCC TGGCTCAACG GTACCGATCT CGAGGCATCT TCCGGGGATG 2500 
CTAGCTACGG GCAGGACTCC TCCGACTATC TTCCCGAGGG AGCCACCGAT 2550 
GGCTCTGCGC AACCGATCCT GCCTGCCGGT GGCGGTCCTG GCGGCAACCC 2600 
TCGCCTGTAC GACGAGCTCA TCCGCGTGTC AGTGACCATC AAGAACACCG 2650 
GCAAGGTTGC TGGTGATGAA GTTCCCCAAC TGTATGTTTC CCTTGGCGGT 2700 
CCCAATGAGC CCAAGATCGT GCTGCGTCAA TTCGAGCGCA TCACGCTGCA 2750 
GCCGTCGGAG GAGACGAAGT GGAGCACGAC TCTGACGCGC CGTGACCTTG 2800 
CAAACTGGAA TGTTGAGAAG CAGGACTGGG AGATTACGTC GTATCCCAAG 2850 
ATGGTGTTTG TCGGAAGCTC CTCGCGGAAG CTGCCGCTCC GGGCGTCTCT 2900 
GCCTACTGTT CACTAACCCG GGCGAGCTCG AATTGATCGT TCAAACATTT 2950 
GGCAATAAAG TTTCTTAAGA TTGAATCCTG TTGCCGGTCT TGCGATGATT 3000 
ATCATATAAT TTCTGTTGAA TTACGTTAAG CATGTAATAA TTAAACATGT 3050 
AATGCATGAC GTTATTTATG AGATGGGGTT TTTATGATTA AGAGTCCCCG 3100 
CAATTATACA TTTTAATACG CGATAGAAAA ACAAAATATA GCGCCCAAAC 3150 
TAAGGATAAA ATTATTCGCG CCGCGGGGGG GCATTCTATG GTTACTAGAT 3200 
CTCTAGAATT CC 3212 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 841 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 



Asp Glu Leu Ala Tyr Ser Pro Pro Tyr Tyr Pro Ser Pro Trp Ala 
5 10 15 

Asn Gly Gln Gly Asp Trp Ala Gln Ala Tyr Gln Arg Ala Val Asp 
20 25 30 

Ile Val Ser Gln Met Thr Leu Asp Glu Lys Val Asn Leu Thr Thr 
35 40 45 

Gly Thr Gly Trp Glu Leu Glu Leu Cys Val Gly Gln Thr Gly Gly 
50 55 60 

Val Pro Arg Leu Gly Val Pro Gly Met Cys Leu Gln Asp Ser Pro 
65 70 75 

Leu Gly Val Arg Asp Ser Asp Tyr Asn Ser Ala Phe Pro Ala Gly 
80 85 90 

Met Asn Val Ala Ala Thr Trp Asp Lys Asn Leu Ala Tyr Leu Arg 
95 100 105 

Gly Lys Ala Met Gly Gln Glu Phe Ser Asp Lys Gly Ala Asp Ile 
110 115 120 

Gln Leu Gly Pro Ala Ala Gly Pro Leu Gly Arg Ser Pro Asp Gly 
125 130 135 

Gly Arg Asn Trp Glu Gly Phe Ser Pro Asp Pro Ala Leu Ser Gly 
140 145 150 

Val Leu Phe Ala Glu Thr Ile Lys Gly Ile Gln Asp Ala Gly Val 
155 160 165 

Val Ala Thr Ala Lys His Tyr Ile Ala Tyr Glu Gln Glu His Phe 
170 175 180 

Arg Gln Ala Pro Glu Ala Gln Gly Phe Gly Phe Asn Ile Ser Glu 
185 190 195 

Ser Gly Ser Ala Asn Leu Asp Asp Lys Thr Met His Glu Leu Tyr 
200 205 210 

Leu Trp Pro Phe Ala Asp Ala Ile Arg Ala Gly Ala Gly Ala Val 
215 220 225 

Met Cys Ser Tyr Asn Gln Ile Asn Asn Ser Tyr Gly Cys Gln Asn 
230 235 240 

Ser Tyr Thr Leu Asn Lys Leu Leu Lys Ala Glu Leu Gly Phe Gln 
245 250 255 

Gly Phe Val Met Ser Asp Trp Ala Ala His His Ala Gly Val Ser 
260 265 270 

Gly Ala Leu Ala Gly Leu Asp Met Ser Met Pro Gly Asp Val Asp 
275 280 285 

Tyr Asp Ser Gly Thr Ser Tyr Trp Gly Thr Asn Leu Thr Ile Ser 
290 295 300 

Val Leu Asn Gly Thr Val Pro Gln Trp Arg Val Asp Asp Met Ala 
305 310 315 

Val Arg Ile Met Ala Ala Tyr Tyr Lys Val Gly Arg Asp Arg Leu 
320 325 330 

Trp Thr Pro Pro Asn Phe Ser Ser Trp Thr Arg Asp Glu Tyr Gly 
335 340 345 

Tyr Lys Tyr Tyr Tyr Val Ser Glu Gly Pro Tyr Glu Lys Val Asn 
350 355 360 

Gln Tyr Val Asn Val Gln Arg Asn His Ser Glu Leu Ile Arg Arg 














365 370 375 

Ile Gly Ala Asp Ser Thr Val Leu Leu Lys Asn Asp Gly Ala Leu 
380 385 390 

Pro Leu Thr Gly Lys Glu Arg Leu Val Ala Leu Ile Gly Glu Asp 
395 400 405 

Ala Gly Ser Asn Pro Tyr Gly Ala Asn Gly Cys Ser Asp Arg Gly 
410 415 420 

Cys Asp Asn Gly Thr Leu Ala Met Gly Trp Gly Ser Gly Thr Ala 
425 430 435 

Asn Phe Pro Tyr Leu Val Thr Pro Glu Gln Ala Ile Ser Asn Glu 
440 445 450 

Val Leu Lys His Lys Asn Gly Val Phe Thr Ala Thr Asp Asn Trp 
455 460 465 

Ala Ile Asp Gln Ile Glu Ala Leu Ala Lys Thr Ala Ser Val Ser 
470 475 480 

Leu Val Phe Val Asn Ala Asp Ser Gly Glu Gly Tyr Ile Asn Val 
485 490 495 

Asp Gly Asn Leu Gly Asp Arg Arg Asn Leu Thr Leu Trp Arg Asn 
500 505 510 

Gly Asp Asn Val Ile Lys Ala Ala Ala Ser Asn Cys Asn Asn Thr 
515 520 525 

Ile Val Val Ile His Ser Val Gly Pro Val Leu Val Asn Glu Trp 
530 535 540 

Tyr Asp Asn Pro Asn Val Thr Ala Ile Leu Trp Gly Gly Leu Pro 
545 550 555 

Gly Gln Glu Ser Gly Asn Ser Leu Ala Asp Val Leu Tyr Gly Arg 
560 565 570 

Val Asn Pro Gly Ala Lys Ser Pro Phe Thr Trp Gly Lys Thr Arg 
575 580 585 

Glu Ala Tyr Gln Asp Tyr Leu Val Thr Glu Pro Asn Asn Gly Asn 
590 595 600 

Gly Ala Pro Gln Glu Asp Phe Val Glu Gly Val Phe Ile Asp Tyr 
605 610 615 

Arg Gly Phe Asp Lys Arg Asn Glu Thr Pro Ile Tyr Glu Phe Gly 
620 625 630 

Tyr Gly Leu Ser Tyr Thr Thr Phe Asn Tyr Ser Asn Leu Glu Val 
635 640 645 

Gln Val Leu Ser Ala Pro Ala Tyr Glu Pro Ala Ser Gly Glu Thr 
650 655 660 

Glu Ala Ala Pro Thr Phe Gly Glu Val Gly Asn Ala Ser Asp Tyr 
665 670 675 

Leu Tyr Pro Ser Gly Leu Leu Arg Ile Thr Lys Phe Ile Tyr Pro 
680 685 690 

Trp Leu Asn Gly Thr Asp Leu Glu Ala Ser Ser Gly Asp Ala Ser 
695 700 705 

Tyr Gly Gln Asp Ser Ser Asp Tyr Leu Pro Glu Gly Ala Thr Asp 
710 715 720 

Gly Ser Ala Gln Pro Ile Leu Pro Ala Gly Gly Gly Pro Gly Gly 
725 730 735 

Asn Pro Arg Leu Tyr Asp Glu Leu Ile Arg Val Ser Val Thr Ile 
740 745 750 

Lys Asn Thr Gly Lys Val Ala Gly Asp Glu Val Pro Gln Leu Tyr 
755 760 765 

Val Ser Leu Gly Gly Pro Asn Glu Pro Lys Ile Val Leu Arg Gln 
770 775 780 

Phe Glu Arg Ile Thr Leu Gln Pro Ser Glu Glu Thr Lys Trp Ser 
785 790 795 

Thr Thr Leu Thr Arg Arg Asp Leu Ala Asn Trp Asn Val Glu Lys 
800 805 810 

Gln Asp Trp Glu Ile Thr Ser Tyr Pro Lys Met Val Phe Val Gly 
815 820 825 

Ser Ser Ser Arg Lys Leu Pro Leu Arg Ala Ser Leu Pro Thr Val 
830 835 840 

His 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3329 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 



GAATTCCCGA TCCTATCTGT CACTTCATCA AAAGGACAGT AGAAAAGGAA 50 
GGTGGCACTA CAAATGCCAT CATTGCGATA AAGGAAAGGC TATCGTTCAA 100 
GATGCCTCTG CCGACAGTGG TCCCAAAGAT GGACCCCCAC CCACGAGGAG 150 
CATCGTGGAA AAAGAAGACG TTCCAACCAC GTCTTCAAAG CAAGTGGATT 200 
GATGTGATAT CTCCACTGAC GTAAGGGATG ACGCACAATC CCACTATCCT 250 
TCGCAAGACC CTTCCTCTAT ATAAGGAAGT TCATTTCATT TGGAGAGGAC 300 
AGGCTTCTTG AGATCCTTCA ACAATTACCA ACAACAACAA ACAACAAACA 350 
ACATTACAAT TACTATTTAC AATTACAGTC GAGGGGATCT ATGGCGCGAA 400 
AATCCCTAAT TTTCCCGGTG ATTTTGCTCG CCGTTCTTCT CTTCTCTCCG 450 
CCGATTTACT CCGCCGGTCA CGATTACCGC GACGCTCTCC GTAAATCTAG 500 
CATGGCTGAT GAATTGGCCT ACTCCCCACC GTATTACCCA TCCCCTTGGG 550 
CCAATGGCCA GGGCGACTGG GCGCAGGCAT ACCAGCGCGC TGTTGATATT 600 
GTCTCGCAAA TGACATTGGA TGAGAAGGTC AATCTGACCA CAGGAACTGG 650 
ATGGGAATTG GAACTATGTG TTGGTCAGAC TGGCGGTGTT CCCCGATTGG 700 
GAGTTCCGGG AATGTGTTTA CAGGATAGCC CTCTGGGCGT TCGCGACTCC 750 
GACTACAACT CTGCTTTCCC TGCCGGCATG AACGTGGCTG CGACCTGGGA 800 
CAAGAATCTG GCATACCTTC GCGGCAAGGC TATGGGTCAG GAATTTAGTG 850 
ACAAGGGTGC CGATATCCAA TTGGGTCCAG CTGCCGGCCC TCTCGGTAGA 900 
AGTCCCGACG GTGGTCGTAA CTGGGAGGGC TTCTCCCCAG ACCCTGCCCT 950 
AAGTGGTGTG CTCTTTGCCG AGACCATCAA GGGTATCCAA GATGCTGGTG 1000 
TGGTTGCGAC GGCTAAGCAC TACATTGCTT ACGAGCAAGA GCATTTCCGT 1050 
CAGGCGCCTG AAGCCCAAGG TTTTGGATTT AATATTTCCG AGAGTGGAAG 1100 
TGCGAACCTC GATGATAAGA CTATGCACGA GCTGTACCTC TGGCCCTTCG 1150 
CGGATGCCAT CCGTGCAGGT GCTGGCGCTG TGATGTGCTC CTACAACCAG 1200 
ATCAACAACA GTTATGGCTG CCAGAACAGC TACACTCTGA ACAAGCTGCT 1250 
CAAGGCCGAG CTGGGCTTCC AGGGCTTTGT CATGAGTGAT TGGGCTGCTC 1300 
ACCATGCTGG TGTGAGTGGT GCTTTGGCAG GATTGGATAT GTCTATGCCA 1350 
GGAGACGTCG ACTACGACAG TGGTACGTCT TACTGGGGTA CAAACTTGAC 1400 
CATTAGCGTG CTCAACGGAA CGGTGCCCCA ATGGCGTGTT GATGACATGG 1450 
CTGTCCGCAT CATGGCCGCC TACTACAAGG TCGGCCGTGA CCGTCTGTGG 1500 
ACTCCTCCCA ACTTCAGCTC ATGGACCAGA GATGAATACG GCTACAAGTA 1550 
CTACTACGTG TCGGAGGGAC CGTACGAGAA GGTCAACCAG TACGTGAATG 1600 
TGCAACGCAA CCACAGCGAA CTGATTCGCC GCATTGGAGC GGACAGCACG 1650 
GTGCTCCTCA AGAACGACGG CGCTCTGCCT TTGACTGGTA AGGAGCGCCT 1700 
GGTCGCGCTT ATCGGAGAAG ATGCGGGCTC CAACCCTTAT GGTGCCAACG 1750 
GCTGCAGTGA CCGTGGATGC GACAATGGAA CATTGGCGAT GGGCTGGGGA 1800 
AGTGGTACTG CCAACTTCCC ATACCTGGTG ACCCCCGAGC AGGCCATCTC 1850 
AAACGAGGTG CTTAAGCACA AGAATGGTGT ATTCACCGCC ACCGATAACT 1900 
GGGCTATCGA TCAAATTGAG GCGCTTGCTA AGACCGCCAG TGTCTCTCTT 1950 
GTCTTTGTCA ACGCCGACTC TGGTGAGGGT TACATCAATG TGGACGGAAA 2000 
CCTGGGTGAC CGCAGGAACC TGACCCTGTG GAGGAACGGC GATAATGTGA 2050 
TCAAGGCTGC TGCTAGCAAC TGCAACAACA CAATCGTTGT CATTCACTCT 2100 




GTCGGACCAG TCTTGGTTAA CGAGTGGTAC GACAACCCCA ATGTTACCGC 2150 
TATCCTCTGG GGTGGTTTGC CCGGTCAGGA GTCTGGCAAC TCTCTTGCCG 2200 
ACGTCCTCTA TGGCCGTGTC AACCCCGGTG CCAAGTCGCC CTTTACCTGG 2250 
GGCAAGACTC GTGAGGCCTA CCAAGACTAC TTGGTCACCG AGCCCAACAA 2300 
CGGCAACGGA GCCCCTCAGG AAGACTTTGT CGAGGGCGTC TTCATTGACT 2350 
ACCGTGGATT TGACAAGCGC AACGAGACCC CGATCTACGA GTTCGGCTAT 2400 
GGTCTGAGCT ACACCACTTT CAACTACTCG AACCTTGAGG TGCAGGTGCT 2450 
GAGCGCCCCT GCATACGAGC CTGCTTCGGG TGAGACCGAG GCAGCGCCAA 2500 
CCTTCGGAGA GGTTGGAAAT GCGTCGGATT ACCTCTACCC CAGCGGATTG 2550 
CTGAGAATTA CCAAGTTCAT CTACCCCTGG CTCAACGGTA CCGATCTCGA 2600 
GGCATCTTCC GGGGATGCTA GCTACGGGCA GGACTCCTCC GACTATCTTC 2650 
CCGAGGGAGC CACCGATGGC TCTGCGCAAC CGATCCTGCC TGCCGGTGGC 2700 
GGTCCTGGCG GCAACCCTCG CCTGTACGAC GAGCTCATCC GCGTGTCAGT 2750 
GACCATCAAG AACACCGGCA AGGTTGCTGG TGATGAAGTT CCCCAACTGT 2800 
ATGTTTCCCT TGGCGGTCCC AATGAGCCCA AGATCGTGCT GCGTCAATTC 2850 
GAGCGCATCA CGCTGCAGCC GTCGGAGGAG ACGAAGTGGA GCACGACTCT 2900 
GACGCGCCGT GACCTTGCAA ACTGGAATGT TGAGAAGCAG GACTGGGAGA 2950 
TTACGTCGTA TCCCAAGATG GTGTTTGTCG GAAGCTCCTC GCGGAAGCTG 3000 
CCGCTCCGGG CGTCTCTGCC TACTGTTCAC TAACCCGGGC GAGCTCGAAT 3050 
TGATCGTTCA AACATTTGGC AATAAAGTTT CTTAAGATTG AATCCTGTTG 3100 
CCGGTCTTGC GATGATTATC ATATAATTTC TGTTGAATTA CGTTAAGCAT 3150 
GTAATAATTA AACATGTAAT GCATGACGTT ATTTATGAGA TGGGGTTTTT 3200 
ATGATTAAGA GTCCCCGCAA TTATACATTT TAATACGCGA TAGAAAAACA 3250 
AAATATAGCG CCCAAACTAA GGATAAAATT ATTCGCGCCG CGGGGGGGCA 3300 
TTCTATGGTT ACTAGATCTC TAGAATTCC 3329 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 880 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 




Met Ala Arg Lys Ser Leu Ile Phe Pro Val Ile Leu Leu Ala Val 
5 10 15 

Leu Leu Phe Ser Pro Pro Ile Tyr Ser Ala Gly His Asp Tyr Arg 
20 25 30 

Asp Ala Leu Arg Lys Ser Ser Met Ala Asp Glu Leu Ala Tyr Ser 
35 40 45 

Pro Pro Tyr Tyr Pro Ser Pro Trp Ala Asn Gly Gln Gly Asp Trp 
50 55 60 

Ala Gln Ala Tyr Gln Arg Ala Val Asp Ile Val Ser Gln Met Thr 
65 70 75 

Leu Asp Glu Lys Val Asn Leu Thr Thr Gly Thr Gly Trp Glu Leu 
80 85 90 

Glu Leu Cys Val Gly Gln Thr Gly Gly Val Pro Arg Leu Gly Val 
95 100 105 

Pro Gly Met Cys Leu Gln Asp Ser Pro Leu Gly Val Arg Asp Ser 
110 115 120 

Asp Tyr Asn Ser Ala Phe Pro Ala Gly Met Asn Val Ala Ala Thr 
125 130 135 

Trp Asp Lys Asn Leu Ala Tyr Leu Arg Gly Lys Ala Met Gly Gln 
140 145 150 

Glu Phe Ser Asp Lys Gly Ala Asp Ile Gln Leu Gly Pro Ala Ala 
155 160 165 

Gly Pro Leu Gly Arg Ser Pro Asp Gly Gly Arg Asn Trp Glu Gly 
170 175 180 

Phe Ser Pro Asp Pro Ala Leu Ser Gly Val Leu Phe Ala Glu Thr 
185 190 195 

Ile Lys Gly Ile Gln Asp Ala Gly Val Val Ala Thr Ala Lys His 
200 205 210 

Tyr Ile Ala Tyr Glu Gln Glu His Phe Arg Gln Ala Pro Glu Ala 
215 220 225 

Gln Gly Phe Gly Phe Asn Ile Ser Glu Ser Gly Ser Ala Asn Leu 
230 235 240 

Asp Asp Lys Thr Met His Glu Leu Tyr Leu Trp Pro Phe Ala Asp 
245 250 255 

Ala Ile Arg Ala Gly Ala Gly Ala Val Met Cys Ser Tyr Asn Gln 
260 265 270 

Ile Asn Asn Ser Tyr Gly Cys Gln Asn Ser Tyr Thr Leu Asn Lys 
275 280 285 

Leu Leu Lys Ala Glu Leu Gly Phe Gln Gly Phe Val Met Ser Asp 
290 295 300 

Trp Ala Ala His His Ala Gly Val Ser Gly Ala Leu Ala Gly Leu 
305 310 315 

Asp Met Ser Met Pro Gly Asp Val Asp Tyr Asp Ser Gly Thr Ser 
320 325 330 

Tyr Trp Gly Thr Asn Leu Thr Ile Ser Val Leu Asn Gly Thr Val 
335 340 345 

Pro Gln Trp Arg Val Asp Asp Met Ala Val Arg Ile Met Ala Ala 
350 355 360 

Tyr Tyr Lys Val Gly Arg Asp Arg Leu Trp Thr Pro Pro Asn Phe 
365 370 375 

Ser Ser Trp Thr Arg Asp Glu Tyr Gly Tyr Lys Tyr Tyr Tyr Val 
380 385 390 

Ser Glu Gly Pro Tyr Glu Lys Val Asn Gln Tyr Val Asn Val Gln 
395 400 405 

Arg Asn His Ser Glu Leu Ile Arg Arg Ile Gly Ala Asp Ser Thr 
410 415 420 

Val Leu Leu Lys Asn Asp Gly Ala Leu Pro Leu Thr Gly Lys Glu 
425 430 435 


Arg 


Leu 


val 


Ala Leu 


He 


Gly Glu 


Asp 


Ala 


Gly 


Ser 


Asn 


Pro 


Tyr 








440 








445 








450 


Gly Ala Asn 


Gly Cys 


Ser 


Asp 


Arg 


Gly Cys 


Asp 


Asn 


Gly Thr 


Leu 








455 








460 










165 


Ala 


Met Gly Trp Gly 


Ser 


Gly Thr Ala 


Asn 


Phe 


Pro 


Tyr 


Leu 


Val 








470 










475 








480 


Thr 


Pro 


Glu 


Gin Ala 


He 


Ser 


Asn 


Glu 


Val 


Leu 


Lys 


His 


Lys 


Asn 








485 










490 






495 


Gly 


Val 


Phe 


Thr Ala 


Thr 


Asp 


Asn 


Trp 


Ala 


He 


Asp 


Gin 


He 


Glu 








500 










505 








510 


Ala 


Leu 


Ala 


Lys Thr 


Ala 


Ser 


Val 


Ser 


Leu 


val 


Phe 


val 


Asn 


Ala 








515 










520 










525 


Asp 


Ser 


Gly Glu Gly 


Tyr 


He 


Asn 


Val 


Asp Gly Asn 


Leu 


Gly 


Asp 








530 










535 








540 


Arg 


Arg 


Asn 


Leu Thr 


Leu 


Trp 


Arg 


Asn 


Gly 


Asp 


Asn 


Val 


He 


Lys 








545 










550 










555 


Ala 


Ala 


Ala 


Ser Asn 


Cys 


Asn 


Asn 


Thr 


He 


Val 


Val 


He 


His 


Ser 








560 










565 










570 


Val 


Gly 


Pro 


Val Leu 


val 


Asn 


Glu 


Trp 


Tyr 


Asp 


Asn 


Pro 


Asn 


val 








575 










580 










585 


Thr 


Ala 


lie 


Leu Trp 


Gly 


Gly 


Leu 


Pro 


Gly 


Gin 


Glu 


Ser 


Gly 


Asn 








590 










595 








600 


Ser 


Leu 


Ala 


Asp Val 


Leu 


Tyr 


Gly Arg 


Val 


Asn 


Pro Gly Ala 


Lys 








605 










610 










615 


Ser 


Pro 


Phe 


Thr Trp 


Gly Lys 


Thr 


Arg 


Glu 


Ala 


Tyr 


Gin Asp 


Tyr 








620 










625 










630 


Leu 


Val 


Thr 


Glu Pro 


Asn 


Asn 


Gly Asn 


Gly 


Ala 


Pro 


Gin 


Glu 


Asp 








635 










640 










645 


Phe 


Val 


Glu Gly Val 


Phe 


He 


Asp Tyr 


Arg 


Gly 


Phe 


Asp 


Lys 


Arg 








650 










655 










660 


Asn 


Glu 


Thr 


Pro He 


Tyr 


Glu 


Phe 


Gly 


Tyr 


Gly 


Leu 


Ser 


Tyr 


Thr 








665 










670 










675 


Thr 


Phe 


Asn 


Tyr Ser 


Asn 


Leu 


Glu 


Val 


Gin 


val 


Leu 


Ser 


Ala 


Pro 








680 










685 










690 


Ala 


Tyr 


Glu 


Pro Ala 


Ser 


Gly 


Glu 


Thr 


Glu 


Ala 


Ala 


Pro 


Thr 


Phe 








695 










700 










705 


Gly 


Glu 


val 


Gly Asn 


Ala 


Ser 


Asp 


Tyr 


Leu 


Tyr 


Pro 


Ser 


Gly 


Leu 








710 










715 










720 


Leu 


Arg 


He 


Thr Lys 


Phe 


He 


Tyr 


Pro 


Trp 


Leu 


Asn 


Gly 


Thr 


Asp 








725 










730 










735 


Leu 


Glu 


Ala 


Ser Ser 


Gly 


Asp 


Ala 


Ser 


Tyr 


Gly Gin 


Asp 


Ser 


Ser 








740 










745 










750 


Asp Tyr 


Leu 


Pro Glu 


Gly Ala 


Thr 


Asp 


Gly 


Ser 


Ala 


Gin 


Pro 


He 
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755 










Leu 


Pro 


Ala 


Glv 
j 


Glv 


v^xy 


Pro 


ui y 


uj.y 










770 






Glu 


Leu 


He 


Ara 


Val 
785 


Ser 


Val 


I n r 


Tie 


Ala 


Gly 


Asp 


Glu 


Val 


Pro 


Gin 


Leu 


Tyr 










800 








Asn 


Glu 


Pro 


Lvs 


He 


Val 


Leu 


Arg 


xn 










815 








Gin 


Pro 


Ser 


Glu 


Glu 


Thr 


ys 


Trp 


C a v 










830 








Asp 


Leu 


Ala 


Asn 


Trp 


Asn 


Val 


Glu 


Lys 










845 








Ser 


Tyr 


Pro 


Lys 


Met 


Val 


Phe 


Val 


Gly 










860 








Pro 


Leu 


Arg 


Ala 


Ser 


Leu 


Pro 


Thr 


Val 



76 <> 765 



780 

Lys Asn Thr Gly L\ 3 Val 
7 ^0 795 
Val Ser Leu Gly G^y Pro 
805 810 
Phe Glu Arg He Thr Leu 
820 8 25 
Thr Thr Leu Thr Arg Arq 
835 840 
Gin Asp Trp Glu He Thr 
850 855 
Ser Ser Ser Arg Lys Leu 
865 870 
His 

875 880 



(2) INFORMATION FOR SEQ ID NO: 17 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
His Asp Glu Leu 

(2) INFORMATION FOR SEQ ID NO: 18 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3338

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:



GAATTCCCGA TCCTATCTGT CACTTCATCA AAAGGACAGT AGAAAAGGAA 50
GGTGGCACTA CAAATGCCAT CATTGCGATA AAGGAAAGGC TATCGTTCAA 100
GATGCCTCTG CCGACAGTGG TCCCAAAGAT GGACCCCCAC CCACGAGGAG 150
CATCGTGGAA AAAGAAGACG TTCCAACCAC GTCTTCAAAG CAAGTGGATT 200
GATGTGATAT CTCCACTGAC GTAAGGGATG ACGCACAATC CCACTATCCT 250
TCGCAAGACC CTTCCTCTAT ATAAGGAAGT TCATTTCATT TGGAGAGGAC 300
AGGCTTCTTG AGATCCTTCA ACAATTACCA ACAACAACAA ACAACAAACA 350
ACATTACAAT TACTATTTAC AATTACAGTC GAGGGGATCT ATGGCGCGAA 400
AATCCCTAAT TTTCCCGGTG ATTTTGCTCG CCGTTCTTCT CTTCTCTCCG 450
CCGATTTACT CCGCCGGTCA CGATTACCGC GACGCTCTCC GTAAATCTAG 500
CATGGCTGAT GAATTGGCCT ACTCCCCACC GTATTACCCA TCCCCTTGGG 550
CCAATGGCCA GGGCGACTGG GCGCAGGCAT ACCAGCGCGC TGTTGATATT 600
GTCTCGCAAA TGACATTGGA TGAGAAGGTC AATCTGACCA CAGGAACTGG 650
ATGGGAATTG GAACTATGTG TTGGTCAGAC TGGCGGTGTT CCCCGATTGG 700
GAGTTCCGGG AATGTGTTTA CAGGATAGCC CTCTGGGCGT TCGCGACTCC 750
GACTACAACT CTGCTTTCCC TGCCGGCATG AACGTGGCTG CGACCTGGGA 800
CAAGAATCTG GCATACCTTC GCGGCAAGGC TATGGGTCAG GAATTTAGTG 850
ACAAGGGTGC CGATATCCAA TTGGGTCCAG CTGCCGGCCC TCTCGGTAGA 900
AGTCCCGACG GTGGTCGTAA CTGGGAGGGC TTCTCCCCAG ACCCTGCCCT 950
AAGTGGTGTG CTCTTTGCCG AGACCATCAA GGGTATCCAA GATGCTGGTG 1000
TGGTTGCGAC GGCTAAGCAC TACATTGCTT ACGAGCAAGA GCATTTCCGT 1050
CAGGCGCCTG AAGCCCAAGG TTTTGGATTT AATATTTCCG AGAGTGGAAG 1100






TGCGAACCTC GATGATAAGA CTATGCACGA GCTGTACCTC TGGCCCTTCG 1150
CGGATGCCAT CCGTGCAGGT GCTGGCGCTG TGATGTGCTC CTACAACCAG 1200
ATCAACAACA GTTATGGCTG CCAGAACAGC TACACTCTGA ACAAGCTGCT 1250
CAAGGCCGAG CTGGGCTTCC AGGGCTTTGT CATGAGTGAT TGGGCTGCTC 1300
ACCATGCTGG TGTGAGTGGT GCTTTGGCAG GATTGGATAT GTCTATGCCA 1350
GGAGACGTCG ACTACGACAG TGGTACGTCT TACTGGGGTA CAAACTTGAC 1400
CATTAGCGTG CTCAACGGAA CGGTGCCCCA ATGGCGTGTT GATGACATGG 1450
CTGTCCGCAT CATGGCCGCC TACTACAAGG TCGGCCGTGA CCGTCTGTGG 1500
ACTCCTCCCA ACTTCAGCTC ATGGACCAGA GATGAATACG GCTACAAGTA 1550
CTACTACGTG TCGGAGGGAC CGTACGAGAA GGTCAACCAG TACGTGAATG 1600
TGCAACGCAA CCACAGCGAA CTGATTCGCC GCATTGGAGC GGACAGCACG 1650
GTGCTCCTCA AGAACGACGG CGCTCTGCCT TTGACTGGTA AGGAGCGCCT 1700
GGTCGCGCTT ATCGGAGAAG ATGCGGGCTC CAACCCTTAT GGTGCCAACG 1750
GCTGCAGTGA CCGTGGATGC GACAATGGAA CATTGGCGAT GGGCTGGGGA 1800
AGTGGTACTG CCAACTTCCC ATACCTGGTG ACCCCCGAGC AGGCCATCTC 1850
AAACGAGGTG CTTAAGCACA AGAATGGTGT ATTCACCGCC ACCGATAACT 1900
GGGCTATCGA TCAAATTGAG GCGCTTGCTA AGACCGCCAG TGTCTCTCTT 1950
GTCTTTGTCA ACGCCGACTC TGGTGAGGGT TACATCAATG TGGACGGAAA 2000
CCTGGGTGAC CGCAGGAACC TGACCCTGTG GAGGAACGGC GATAATGTGA 2050
TCAAGGCTGC TGCTAGCAAC TGCAACAACA CAATCGTTGT CATTCACTCT 2100
GTCGGACCAG TCTTGGTTAA CGAGTGGTAC GACAACCCCA ATGTTACCGC 2150
TATCCTCTGG GGTGGTTTGC CCGGTCAGGA GTCTGGCAAC TCTCTTGCCG 2200
ACGTCCTCTA TGGCCGTGTC AACCCCGGTG CCAAGTCGCC CTTTACCTGG 2250
GGCAAGACTC GTGAGGCCTA CCAAGACTAC TTGGTCACCG AGCCCAACAA 2300
CGGCAACGGA GCCCCTCAGG AAGACTTTGT CGAGGGCGTC TTCATTGACT 2350
ACCGTGGATT TGACAAGCGC AACGAGACCC CGATCTACGA GTTCGGCTAT 2400
GGTCTGAGCT ACACCACTTT CAACTACTCG AACCTTGAGG TGCAGGTGCT 2450
GAGCGCCCCT GCATACGAGC CTGCTTCGGG TGAGACCGAG GCAGCGCCAA 2500
CCTTCGGAGA GGTTGGAAAT GCGTCGGATT ACCTCTACCC CAGCGGATTG 2550
CTGAGAATTA CCAAGTTCAT CTACCCCTGG CTCAACGGTA CCGATCTCGA 2600
GGCATCTTCC GGGGATGCTA GCTACGGGCA GGACTCCTCC GACTATCTTC 2650
CCGAGGGAGC CACCGATGGC TCTGCGCAAC CGATCCTGCC TGCCGGTGGC 2700
GGTCCTGGCG GCAACCCTCG CCTGTACGAC GAGCTCATCC GCGTGTCAGT 2750
GACCATCAAG AACACCGGCA AGGTTGCTGG TGATGAAGTT CCCCAACTGT 2800
ATGTTTCCCT TGGCGGTCCC AATGAGCCCA AGATCGTGCT GCGTCAATTC 2850
GAGCGCATCA CGCTGCAGCC GTCGGAGGAG ACGAAGTGGA GCACGACTCT 2900
GACGCGCCGT GACCTTGCAA ACTGGAATGT TGAGAAGCAG GACTGGGAGA 2950
TTACGTCGTA TCCCAAGATG GTGTTTGTCG GAAGCTCCTC GCGGAAGCTG 3000
CCGCTCCGGG CGTCTCTGCC TACTGTTCAT GATGAACTTT AACCCGGGCG 3050
AGCTCGAATT GATCGTTCAA ACATTTGGCA ATAAAGTTTC TTAAGATTGA 3100
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GTTAAGCATG TAATAATTAA ACATGTAATG CATGACGTTA TTTATGAGAT 3200 
GGGGTTTTTA TGATTAAGAG TCCCCGCAAT TATACATTTT AATACGCGAT 3250 
AGAAAAACAA AATATAGCGC CCAAACTAAG GATAAAATTA TTCGCGCCGC 3300 
GGGGGGGCAT TCTATGGTTA CTAGATCTCT AGAATTCC 

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 883

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Met Ala Arg Lys Ser Leu Ile Phe Pro Val Ile Leu Leu Ala Val
5 10 15

Leu Leu Phe Ser Pro Pro Ile Tyr Ser Ala Gly His Asp Tyr Arg
20 25 30

Asp Ala Leu Arg Lys Ser Ser Met Ala Asp Glu Leu Ala Tyr Ser
35 40 45

Pro Pro Tyr Tyr Pro Ser Pro Trp Ala Asn Gly Gln Gly Asp Trp
50 55 60

Ala Gln Ala Tyr Gln Arg Ala Val Asp Ile Val Ser Gln Met Thr
65 70 75

Leu Asp Glu Lys Val Asn Leu Thr Thr Gly Thr Gly Trp Glu Leu
80 85 90

Glu Leu Cys Val Gly Gln Thr Gly Gly Val Pro Arg Leu Gly Val
95 100 105

Pro Gly Met Cys Leu Gln Asp Ser Pro Leu Gly Val Arg Asp Ser
110 115 120

Asp Tyr Asn Ser Ala Phe Pro Ala Gly Met Asn Val Ala Ala Thr
125 130 135

Trp Asp Lys Asn Leu Ala Tyr Leu Arg Gly Lys Ala Met Gly Gln
140 145 150

Glu Phe Ser Asp Lys Gly Ala Asp Ile Gln Leu Gly Pro Ala Ala
155 160 165

Gly Pro Leu Gly Arg Ser Pro Asp Gly Gly Arg Asn Trp Glu Gly
170 175 180

Phe Ser Pro Asp Pro Ala Leu Ser Gly Val Leu Phe Ala Glu Thr
185 190 195

Ile Lys Gly Ile Gln Asp Ala Gly Val Val Ala Thr Ala Lys His
200 205 210

Tyr Ile Ala Tyr Glu Gln Glu His Phe Arg Gln Ala Pro Glu Ala
215 220 225

Gln Gly Phe Gly Phe Asn Ile Ser Glu Ser Gly Ser Ala Asn Leu
230 235 240

Asp Asp Lys Thr Met His Glu Leu Tyr Leu Trp Pro Phe Ala Asp
245 250 255

Ala Ile Arg Ala Gly Ala Gly Ala Val Met Cys Ser Tyr Asn Gln
260 265 270

Ile Asn Asn Ser Tyr Gly Cys Gln Asn Ser Tyr Thr Leu Asn Lys
275 280 285

Leu Leu Lys Ala Glu Leu Gly Phe Gln Gly Phe Val Met Ser Asp
290 295 300

Trp Ala Ala His His Ala Gly Val Ser Gly Ala Leu Ala Gly Leu
305 310 315

Asp Met Ser Met Pro Gly Asp Val Asp Tyr Asp Ser Gly Thr Ser
320 325 330

Tyr Trp Gly Thr Asn Leu Thr Ile Ser Val Leu Asn Gly Thr Val
335 340 345

Pro Gln Trp Arg Val Asp Asp Met Ala Val Arg Ile Met Ala Ala
350 355 360

Tyr Tyr Lys Val Gly Arg Asp Arg Leu Trp Thr Pro Pro Asn Phe
365 370 375

Ser Ser Trp Thr Arg Asp Glu Tyr Gly Tyr Lys Tyr Tyr Tyr Val
380 385 390

Ser Glu Gly Pro Tyr Glu Lys Val Asn Gln Tyr Val Asn Val Gln
395 400 405

Arg Asn His Ser Glu Leu Ile Arg Arg Ile Gly Ala Asp Ser Thr
410 415 420

Val Leu Leu Lys Asn Asp Gly Ala Leu Pro Leu Thr Gly Lys Glu
425 430 435

Arg Leu Val Ala Leu Ile Gly Glu Asp Ala Gly Ser Asn Pro Tyr
440 445 450

Gly Ala Asn Gly Cys Ser Asp Arg Gly Cys Asp Asn Gly Thr Leu
455 460 465






Ala Met Gly Trp Gly Ser Gly Thr Ala Asn Phe Pro Tyr Leu Val
470 475 480

Thr Pro Glu Gln Ala Ile Ser Asn Glu Val Leu Lys His Lys Asn
485 490 495

Gly Val Phe Thr Ala Thr Asp Asn Trp Ala Ile Asp Gln Ile Glu
500 505 510

Ala Leu Ala Lys Thr Ala Ser Val Ser Leu Val Phe Val Asn Ala
515 520 525

Asp Ser
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(2) INFORMATION FOR SEQ ID NO: 20 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
CAGTGACCGT GGATGCGACA ATG 23 

(2) INFORMATION FOR SEQ ID NO: 21

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
AGAGACGGAT GACAAGTACT ACTTGAAATT GGGCCCAAAA 40

(2) INFORMATION FOR SEQ ID NO: 22

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 
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(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
CAGTGACCGT GGATGCGACA ATG 23

(2) INFORMATION FOR SEQ ID NO: 23 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
AAAGGATCCT TAGTGAACAG TAGGCAGAGA CGC 33 

(2) INFORMATION FOR SEQ ID NO: 24

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
His Asp Glu Leu
(54) Title: ASPERGILLUS NIGER BETA-GLUCOSIDASE GENE, PROTEIN AND USES THEREOF

(57) Abstract: A polypeptide having β-glucosidase enzymatic activity, a polynucleotide encoding the polypeptide, nucleic acid
constructs carrying the polynucleotide, transformed or infected cells, such as yeast cells, and transgenic organisms expressing the
polynucleotide, and various uses of the polypeptide, the polynucleotide, cells and/or organisms, including producing a recombinant
polypeptide having the β-glucosidase enzymatic activity, increasing the level of aroma compounds in alcoholic beverages, as well as
other fermentation products of plant material, hydrolyzing cellobiose and thus increasing the level of fermentable glucose, increasing
the production of alcohol, such as ethanol, from plant material, increasing the aroma released from a plant or a plant product, and
hydrolysis or transglycosylation of glycosides.
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